Comparing Data Between Oracle Databases Using Ansible

As an Oracle DBA, verifying the integrity and consistency of data across different environments is crucial. In this article, we will go through a step-by-step approach to automate the comparison of data between two Oracle databases using Ansible. We’ll leverage ORA_HASH and other methods like MD5 and SHA1 hashing for comprehensive data validation.

Prerequisites

Ensure that you have ansible installed. You need Oracle databases to compare, of course. ⁠Smile

Ansible Playbook

Below is an Ansible playbook that performs the following tasks:

  • Connects to source and target databases.
  • Executes queries to compute checksums for specified tables and columns.
  • Compares the results.
  • Sends a consolidated email notification based on the comparison outcome.
---
- name: Compare data between Oracle databases
  hosts: localhost
  gather_facts: no
  vars:
    source_db:
      host: "source_db_host"
      os_user: "oracle"
      oracle_home: "/u01/app/oracle/product/19.0.0/dbhome_1"
      sid: "ORCL"
      pdb: "pdb1"
    target_db:
      host: "target_db_host"
      os_user: "oracle"
      oracle_home: "/u01/app/oracle/product/19.0.0/dbhome_1"
      sid: "ORCL"
      pdb: "pdb1"

    tables_to_compare:
      - name: "table1"
        primary_key: "id"
        columns: ["col1", "col2"]
        condition: "col_100 not like '000xx%'"
      - name: "table2"
        primary_key: "id"
        columns: ["colX", "colY"]
        condition: ""
      # Add more tables as needed

  tasks:
    - name: Build query string for source database
      set_fact:
        source_query: >-
          SELECT {{ item.primary_key }} AS PK_VALUE, {{ item.primary_key }} || ORA_HASH({{ item.columns | join(' || ORA_HASH(') + ')' }}) AS MD5_STR 
          FROM {{ item.name }} 
          WHERE {{ item.columns | join(' IS NOT NULL AND ') }} IS NOT NULL 
          {% if item.condition %} AND {{ item.condition }} {% endif %}

    - name: Query checksum for each table in source_db
      become: yes
      become_user: "{{ source_db.os_user }}"
      environment:
        ORACLE_HOME: "{{ source_db.oracle_home }}"
        ORACLE_SID: "{{ source_db.sid }}"
        ORACLE_PDB_SID: "{{ source_db.pdb }}"
      shell:
        cmd: |
          {{ source_db.oracle_home }}/bin/sqlplus -s "/ as sysdba" <<SQLPLUS
          SET PAGESIZE 0 FEEDBACK OFF VERIFY OFF HEADING OFF ECHO OFF
          {{ source_query }}
          /
          exit;
          SQLPLUS
      register: source_result
      loop: "{{ tables_to_compare }}"
      delegate_to: "{{ source_db.host }}"

    - name: Build query string for target database
      set_fact:
        target_query: >-
          SELECT {{ item.primary_key }} AS PK_VALUE, {{ item.primary_key }} || ORA_HASH({{ item.columns | join(' || ORA_HASH(') + ')' }}) AS MD5_STR 
          FROM {{ item.name }} 
          WHERE {{ item.columns | join(' IS NOT NULL AND ') }} IS NOT NULL 
          {% if item.condition %} AND {{ item.condition }} {% endif %}

    - name: Query checksum for each table in target_db
      become: yes
      become_user: "{{ target_db.os_user }}"
      environment:
        ORACLE_HOME: "{{ target_db.oracle_home }}"
        ORACLE_SID: "{{ target_db.sid }}"
        ORACLE_PDB_SID: "{{ target_db.pdb }}"
      shell:
        cmd: |
          {{ target_db.oracle_home }}/bin/sqlplus -s "/ as sysdba" <<SQLPLUS
          SET PAGESIZE 0 FEEDBACK OFF VERIFY OFF HEADING OFF ECHO OFF
          {{ target_query }}
          /
          exit;
          SQLPLUS
      register: target_result
      loop: "{{ tables_to_compare }}"
      delegate_to: "{{ target_db.host }}"

    - name: Parse SQL query results
      set_fact:
        comparison_results: >-
          {{
            comparison_results | default([]) +
            [{
              "table": item.name,
              "source_hash": (source_result.results[item_index].stdout_lines[0] if source_result.results[item_index].stdout_lines else None),
              "target_hash": (target_result.results[item_index].stdout_lines[0] if target_result.results[item_index].stdout_lines else None),
              "status": (
                "Match" if
                (source_result.results[item_index].stdout_lines[0] if source_result.results[item_index].stdout_lines else None) ==
                (target_result.results[item_index].stdout_lines[0] if target_result.results[item_index].stdout_lines else None)
                else "Mismatch"
              )
            }]
          }}
      loop: "{{ tables_to_compare }}"
      loop_control:
        index_var: "item_index"

    - name: Determine email subject based on comparison results
      set_fact:
        email_subject: >-
          {{
            "Unscrambled data found" if
            (comparison_results | selectattr('status', 'equalto', 'Match') | list | length) > 0
            else "Data scrambling validated"
          }}

    - name: Send email with comparison results
      mail:
        host: "your_smtp_server"
        port: 25
        from: "ansible@itnoesis.com"
        to: "admin@itnoesis.com"
        subject: "{{ email_subject }}"
        body: |
          The following tables were compared:

          {% for result in comparison_results %}
          Table: {{ result.table }}
          Source Hash: {{ result.source_hash }}
          Target Hash: {{ result.target_hash }}
          Status: {{ result.status }}
          {% endfor %}

Additional Methods for Data Hashing

In addition to using ORA_HASH, there are other methods you can use to hash and compare data:

1. Using DBMS_CRYPTO with MD5

The DBMS_CRYPTO package provides various hashing algorithms, including MD5 and SHA1.

SELECT
  LOWER(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW({{ item.columns | join(' || ') }}), DBMS_CRYPTO.HASH_MD5))) AS hash_value
FROM {{ item.name }}
WHERE {{ item.columns | join(' IS NOT NULL AND ') }} IS NOT NULL 
{% if item.condition %} AND {{ item.condition }} {% endif %}

2. Using DBMS_OBFUSCATION_TOOLKIT

Another package for hashing in Oracle is DBMS_OBFUSCATION_TOOLKIT.

SELECT
  LOWER(RAWTOHEX(DBMS_OBFUSCATION_TOOLKIT.MD5(input => UTL_RAW.CAST_TO_RAW({{ item.columns | join(' || ') }})))) AS hash_value
FROM {{ item.name }}
WHERE {{ item.columns | join(' IS NOT NULL AND ') }} IS NOT NULL 
{% if item.condition %} AND {{ item.condition }} {% endif %}

How to use the Playbook

  1. Configure the Variables: Replace placeholder values with actual database connection details and SMTP server settings.
  2. Define Tables and Columns: Adjust the tables_to_compare variable with your table and column names.
  3. Execute the Playbook: Run the playbook using the ansible-playbook command.
ansible-playbook compare_data.yml -i <inventory> <your additional options>

Discover more from IT-Noesis

Subscribe to get the latest posts sent to your email.

Leave a comment