
In Part 1 of this series, we covered the following features:
- JSON Relational Duality
- Automatic Indexing Enhancements
- Blockchain Tables
- Native JSON Data Type
- SQL Macro
In this article, let us have a look at the next set of features:
- Oracle Machine Learning Enhancements
  - Enhanced Support for Python
  - New Machine Learning Algorithms
  - Automated Machine Learning (AutoML)
  - Improved Model Deployment and Management
  - Integration with Oracle Autonomous Database
  - Enhanced SQL APIs for Machine Learning
  - Improved Data Preparation and Feature Engineering
  - Integration with Oracle Data Science
- In-Memory Column Store Enhancements
  - Dynamic In-Memory Column Store Management
  - In-Memory Hybrid Columnar Compression
  - In-Memory Aggregation Enhancements
  - In-Memory Expressions
  - Support for JSON Data
  - Automatic In-Memory Population
  - In-Memory Query Rewrite
- Automatic SQL Plan Management (SPM)
- High-Frequency Statistics Collection
- Automatic Index Optimization
Oracle Machine Learning Enhancements
There have been significant enhancements to Oracle Machine Learning (OML), including improved support for Python, streamlined workflows for model deployment, new algorithms, and better integration with Oracle Autonomous Database. These features empower data scientists, analysts, and developers to build, deploy, and manage machine learning models directly within the Oracle Database ecosystem. Let us look at them one by one.
Enhanced Support for Python:
OML for Python (OML4Py) allows users to build machine learning models using Python directly within the Oracle Database. This eliminates the need to move data outside the database for analysis and model building, and integration with popular Python libraries such as pandas, NumPy, and scikit-learn has been improved. Below is an example of OML4Py where data is pulled from the database into a pandas DataFrame using the `oml.sync` function; the data is split into training and test sets, a model is trained, predictions are made, and accuracy is calculated.
import oml
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
# Load data from Oracle Database into a pandas DataFrame
# (oml.sync returns a proxy oml.DataFrame; pull() materializes it locally)
df = oml.sync(query='SELECT * FROM employees').pull()
# Split data into training and testing sets
train, test = train_test_split(df, test_size=0.2)
# Train a RandomForest model
clf = RandomForestClassifier()
clf.fit(train.drop('target', axis=1), train['target'])
# Make predictions
predictions = clf.predict(test.drop('target', axis=1))
# Calculate accuracy
accuracy = accuracy_score(test['target'], predictions)
print(f'Accuracy: {accuracy:.2f}')
New Machine Learning Algorithms:
New algorithms have been introduced to support a broader range of machine learning tasks, including deep learning, time series analysis, and natural language processing, along with enhanced performance and scalability for existing algorithms. Below is an example of creating a data mining machine learning model.
-- Create a model using the new algorithm (e.g., XGBoost)
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'xgboost_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'employees',
    case_id_column_name => 'employee_id',
    target_column_name  => 'target',
    settings_table_name => 'xgboost_settings');
END;
/
Automated Machine Learning (AutoML)
AutoML features simplify the process of selecting, training, and tuning machine learning models by automating model selection and hyperparameter tuning, making it easier to build high-quality models with minimal manual intervention. Below is an example of creating a new AutoML model.
-- Create an AutoML model
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'automl_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'employees',
    case_id_column_name => 'employee_id',
    target_column_name  => 'target',
    settings_table_name => 'automl_settings');
END;
/
Improved Model Deployment and Management
Simplified workflows make it easier to deploy and manage machine learning models within the Oracle Database. Support for model versioning allows users to manage different versions of models and roll back to previous versions if necessary.
-- In-database models are schema objects: once created, they are immediately
-- available for scoring, and DBMS_DATA_MINING.EXPORT_MODEL / IMPORT_MODEL
-- can be used to move them between databases
SELECT employee_id,
       PREDICTION(xgboost_model USING *) AS prediction
FROM   employees;
Integration with Oracle Autonomous Database
Seamless integration with Oracle Autonomous Database, leveraging its automated features for performance tuning, scaling, and patching. Enhanced support for real-time scoring and model deployment in Oracle Autonomous Database.
Enhanced SQL APIs for Machine Learning
New and improved SQL APIs for training, evaluating, and deploying machine learning models. This allows users to leverage SQL skills to perform machine learning tasks directly within the database. Support for using SQL to perform feature engineering, data preprocessing, and model evaluation.
-- Create a model using SQL
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'sql_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'employees',
    case_id_column_name => 'employee_id',
    target_column_name  => 'target',
    settings_table_name => 'sql_model_settings');
END;
/
-- Evaluate the model using SQL
SELECT PREDICTION(sql_model USING *) AS prediction, target
FROM employees_test;
Improved Data Preparation and Feature Engineering
Enhanced tools for data preparation and feature engineering, including new functions for data transformation, normalization, and aggregation. Support for automated feature engineering using OML.
-- Min-max normalize salary using analytic functions
SELECT employee_id,
       (salary - MIN(salary) OVER ()) /
       NULLIF(MAX(salary) OVER () - MIN(salary) OVER (), 0) AS normalized_salary
FROM   employees;
-- Feature engineering using SQL
SELECT
employee_id,
CASE
WHEN salary > 50000 THEN 'High'
ELSE 'Low'
END AS salary_category
FROM employees;
Integration with Oracle Data Science
Enhanced integration with Oracle Data Science, providing a unified platform for building, deploying, and managing machine learning models. Support for collaborative workflows, allowing data scientists to work together on model development and deployment.
In-Memory Column Store Enhancements
There are several enhancements to the In-Memory Column Store (IMCS), which enables real-time analytics and fast querying by storing data in a columnar format in memory.
Dynamic In-Memory Column Store Management:
Automatically adjusts the size of the In-Memory Column Store based on workload requirements, reducing the need for manual configuration and tuning. Below is the code to size the IM column store; once enabled, the size can be increased on the fly.
-- A non-zero INMEMORY_SIZE enables the IM column store (the initial setting
-- requires a restart; after that, the size can be increased dynamically)
ALTER SYSTEM SET INMEMORY_SIZE = 4G SCOPE = SPFILE;
ALTER SYSTEM SET INMEMORY_FORCE = DEFAULT SCOPE = BOTH;
In-Memory Hybrid Columnar Compression:
Improves compression ratios for data stored in the In-Memory Column Store, reducing the memory footprint and enhancing query performance by minimizing I/O. This is how hybrid columnar compression can be enabled for an existing table.
ALTER TABLE employees INMEMORY MEMCOMPRESS FOR QUERY LOW;
In-Memory Aggregation Enhancements:
Optimizes performance for complex analytical queries that involve aggregations. New algorithms and techniques are included to speed up aggregation operations.
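As a sketch, the optimizer can be nudged toward in-memory vector aggregation (VECTOR GROUP BY) with the VECTOR_TRANSFORM hint; the employees and departments tables below are the usual HR-style sample tables, assumed to be populated in the IM column store:

```sql
-- Ask the optimizer to consider in-memory vector aggregation (VECTOR GROUP BY)
SELECT /*+ VECTOR_TRANSFORM */
       d.department_name,
       SUM(e.salary) AS total_salary
FROM   employees e
JOIN   departments d ON d.department_id = e.department_id
GROUP  BY d.department_name;
```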
In-Memory Expressions:
This allows frequently used expressions to be materialized and stored in the In-Memory Column Store, which improves query performance by avoiding repeated computation of expressions. Below is an example of capturing hot expressions (such as salary + commission_pct) into the in-memory column store.
-- Allow expression capture in the IM column store
ALTER SYSTEM SET INMEMORY_EXPRESSIONS_USAGE = ENABLE;
-- Capture frequently evaluated expressions into the IM column store
EXEC DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CUMULATIVE');
Support for JSON Data:
Enables efficient storage and querying of JSON data in the In-Memory Column Store by leveraging the power of IMCS to accelerate JSON-based analytics.
CREATE TABLE json_table (
  id        NUMBER,
  json_data JSON   -- native JSON data type (covered in Part 1)
) INMEMORY;
INSERT INTO json_table VALUES (1, '{"name": "John", "age": 30}');
INSERT INTO json_table VALUES (2, '{"name": "Jane", "age": 25}');
SELECT json_value(json_data, '$.name') AS name
FROM   json_table;
Automatic In-Memory Population:
Automatically populates the In-Memory Column Store with frequently accessed data which in turn reduces the need for manual intervention and tuning. Use the following query to enable automatic population of frequently accessed data in-memory.
ALTER SYSTEM SET INMEMORY_AUTOMATIC_LEVEL = HIGH;
In-Memory Query Rewrite:
Automatically rewrites queries to take advantage of the In-Memory Column Store. The query execution plans are optimized to leverage columnar storage and indexing.
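To verify that a query is actually being served from the IM column store, inspect its execution plan; a minimal sketch, again using the HR-style employees table:

```sql
EXPLAIN PLAN FOR
SELECT department_id, AVG(salary)
FROM   employees
GROUP  BY department_id;

-- Look for "TABLE ACCESS INMEMORY FULL" in the plan output
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```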
Automatic SQL Plan Management (SPM)
Automatic SPM automates the process of capturing, validating, and evolving SQL execution plans to maintain consistent performance, even as database environments change. Consider a scenario where a database upgrade introduces new execution plans for critical SQL statements. Automatic SQL Plan Management can help by capturing these new plans and validating them against existing SQL plan baselines. Only plans that offer performance improvements are accepted, ensuring that the upgrade does not negatively impact SQL performance. Having burned my hands on upgrades that degraded query performance, I consider this a much-needed enhancement.
Below is the step-by-step procedure to leverage Automatic SPM.
Enable Automatic SPM:
ALTER SYSTEM SET optimizer_capture_sql_plan_baselines = TRUE;
ALTER SYSTEM SET optimizer_use_sql_plan_baselines = TRUE;
-- Turn on the automatic SPM evolve advisor task
EXEC DBMS_SPM.CONFIGURE('AUTO_SPM_EVOLVE_TASK', 'ON');
Execute the critical SQL statements, or let them get executed over time. Below is an example of a not-so-critical SQL statement, but hey... 😊
SELECT * FROM employees WHERE department_id = 10;
View captured plans:
SELECT * FROM dba_sql_plan_baselines;
Evolve the SQL Plan baselines to validate new plans.
DECLARE
  l_report CLOB;
BEGIN
  l_report := DBMS_SPM.evolve_sql_plan_baseline(
                sql_handle => 'SYS_SQL_1234567890abcdef');
  DBMS_OUTPUT.put_line(l_report);
END;
/
SELECT * FROM dba_sql_plan_baselines;
High-Frequency Statistics Collection
Traditionally, Oracle Database collects statistics at regular intervals, which might not capture rapid changes in data distribution, leading to suboptimal execution plans. High-Frequency Statistics Collection addresses this by enabling more frequent and targeted statistics gathering, ensuring that the optimizer has the most current data to generate efficient execution plans.
Consider a scenario where a retail database experiences significant changes in sales data due to seasonal fluctuations. High-Frequency Statistics Collection can ensure that the optimizer has the most current statistics, enabling it to generate optimal execution plans even as data distributions change rapidly.
Enable Automatic and Adaptive Statistics Collection
BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => 'auto optimizer stats collection',
    operation   => NULL,
    window_name => NULL);
END;
/
-- optimizer_adaptive_features was split into two parameters in 12.2
ALTER SYSTEM SET optimizer_adaptive_plans = TRUE;
ALTER SYSTEM SET optimizer_adaptive_statistics = TRUE;
ALTER SYSTEM SET optimizer_dynamic_sampling = 11;
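The maintenance-window task above is complemented by the high-frequency statistics task itself, which is switched on through DBMS_STATS global preferences (the interval and run-time values below are illustrative, not recommendations):

```sql
-- Switch on the high-frequency automatic statistics task
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_STATUS', 'ON');
-- Run it every 300 seconds, with each run capped at 600 seconds
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_INTERVAL', '300');
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_MAX_RUN_TIME', '600');
```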
Configure Incremental and Partition-Level Statistics Collection
-- Convert sales to a range-partitioned table (online, keeping indexes usable)
ALTER TABLE sales
  MODIFY PARTITION BY RANGE (sale_date)
  (
    PARTITION p2023 VALUES LESS THAN (TO_DATE('01-JAN-2024', 'DD-MON-YYYY'))
  ) ONLINE UPDATE INDEXES;
-- Incremental statistics are enabled via a table preference, not a DDL clause
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE');
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname     => 'SH',
    tabname     => 'SALES',
    method_opt  => 'FOR ALL COLUMNS SIZE AUTO',
    granularity => 'AUTO');
END;
/
Gather and View Statistics
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname     => 'SH',
    tabname     => 'SALES',
    partname    => 'P2023',
    method_opt  => 'FOR ALL COLUMNS SIZE AUTO',
    granularity => 'PARTITION');
END;
/
SELECT column_name, histogram, num_buckets
FROM dba_tab_col_statistics
WHERE table_name = 'SALES';
Automatic Index Optimization
And lastly (for this article), "Automatic Index Optimization". We all know that indexes are critical for database performance, as they significantly speed up query execution. However, managing indexes manually can be complex and time-consuming. Automatic Index Optimization simplifies this process by monitoring database usage patterns, identifying inefficient or redundant indexes, and optimizing them without requiring manual intervention. Consider a scenario where a large retail database experiences frequent query performance issues due to inefficient indexing. Automatic Index Optimization can help by continuously monitoring the workload, creating new indexes where needed, dropping unused indexes, and rebuilding fragmented ones. It might not be applicable everywhere, so I would advise exercising caution and testing this in a lower environment first.
Enable automatic indexing:
-- Automatic indexing is configured through the DBMS_AUTO_INDEX package
EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE', 'IMPLEMENT');
-- Optionally restrict automatic indexing to specific schemas
EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_SCHEMA', 'SH', TRUE);
-- Keep unused auto indexes for 373 days before dropping them
EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_RETENTION_FOR_AUTO', '373');
Monitor:
SELECT index_name, table_name
FROM dba_indexes
WHERE auto = 'YES';
SELECT parameter_name, parameter_value
FROM dba_auto_index_config;
Review automatic indexing recommendations and activity:
SELECT DBMS_AUTO_INDEX.REPORT_ACTIVITY() FROM dual;
That brings us to the end of this article in the four-part series. In the next article, we will take a closer look at:
- Multilingual Engine (MLE)
- Enhanced Data Masking and Redaction
- Automatic Data Optimization (ADO) Enhancements
- Graph Studio Enhancements
- Partitioned Hybrid Tables
Hope this helps!