JDBC formula reference

Overview

This document provides a comprehensive reference for JDBC formulas supported in DataGOL. These formulas enable powerful data processing, analysis, and transformation capabilities directly within your DataGOL workflows.

What you can achieve?

Data aggregation: Calculate sums, averages, counts, and statistical measures
Text processing: Clean, format, and manipulate string data
Date/Time operations: Parse, format, and calculate with dates and timestamps
Conditional logic: Implement business rules and data validation
Mathematical operations: Perform calculations and statistical analysis

Mathematical functions

ABS - Absolute Value

Purpose:

Removes negative signs from numbers.

Usage:

ABS(number)  

Example:

returns 25

ABS(-25)

Use Case:

Calculating distances, finding differences without caring about direction, cleaning up negative values in reports.

Mathematical functions

CEIL - Round Up

Purpose:

Always rounds numbers up to the next whole number.

Usage:

CEIL(number)

Example:

Returns 5

CEIL(4.1)

Use Case:

Used for calculating minimum packages needed, determining required resources, and inventory planning.

FLOOR - Round Down

Purpose:

Always rounds numbers down to the previous whole number.

Usage:

FLOOR(number)

Example:

Returns 4

FLOOR(4.9)

Use Case:

Useful for calculating completed units, determining available capacity, and budget calculations.

ROUND - Smart Rounding

Purpose:

Rounds numbers to the nearest whole number or specified decimal places.

Usage:

ROUND(number, decimal_places)

Example:

Returns 123.46

ROUND(123.456, 2)

Use Case:

Applied in financial calculations, displaying prices, and creating clean reports.

POWER - Exponents

Purpose:

Multiplies a number by itself a specified number of times.

Usage:

POWER(base, exponent)

Example:

Returns 8 (2×2×2)

POWER(2, 3)

Use Case:

Used for compound interest calculations, growth projections, and area/volume calculations.

SQRT - Square Root

Purpose:

Finds what number multiplied by itself equals the given number.

Usage:

SQRT(number)

Example:

Returns 4

SQRT(16)

Use Case:

Applied in statistical calculations, geometric measurements, and risk analysis.

LN - Natural Logarithm

Purpose:

Mathematical function for advanced calculations and modeling.

Usage:

LN(number)

Example:

Returns 2.302585

LN(10)

Use Case:

Used in scientific calculations, decay models, and financial modeling.

PI - Pi Constant

Purpose:

Returns the mathematical constant $\pi$ (3.14159...).

Usage:

PI()

Example:

Returns 3.14159

PI()

Use Case:

Used in circle calculations, geometric formulas, and engineering calculations.

MOD - Remainder

Purpose:

Shows what's left over after division.

Usage:

MOD(dividend, divisor)

Example:

Returns 1 (10÷3 = 3 remainder 1)

MOD(10, 3)

Use Case:

Useful for creating alternating patterns, pagination, and scheduling rotations.

Statistical functions

AVG - Average

Purpose:

Calculates the mean value of a group of numbers.

Usage:

AVG(column_name)

Example:

Might return 50000

AVG(salary)

Use Case:

Used for performance metrics, salary analysis, quality scores, and customer ratings.

SUM - Total

Purpose:

Adds up all numbers in a group.

Usage:

SUM(column_name)

Example:

Might return 125000

SUM(sales)

Use Case:

Used for revenue totals, expense calculations, inventory counts, and performance totals.

COUNT - Count Items

Purpose:

Counts how many items exist (ignores empty values).

Usage:

COUNT(column_name)

Example:

Might return 150

COUNT(customer_id)

Use Case:

Used for customer counts, order quantities, active users, and completion rates.

MIN - Smallest Value

Purpose:

Finds the lowest number in a group.

Usage:

MIN(column_name)

Example:

Might return 9.99

MIN(price)

Use Case:

Used for lowest prices, minimum scores, earliest dates, and budget floors.

MAX - Largest Value

Purpose:

Finds the highest number in a group.

Usage:

MAX(column_name)

Example:

Might return 98

MAX(score)

Use Case:

Used for peak performance, highest prices, latest dates, and record highs.

MEDIAN - Middle Value

Purpose:

Finds the middle number when all values are arranged in order.

Usage:

MEDIAN(column_name)

Example:

Might return 45000

MEDIAN(income)

Use Case:

Used for salary benchmarks, performance baselines, and market analysis.

STDDEV - Standard Deviation

Purpose:

Measures how spread out numbers are from the average.

Usage:

STDDEV(column_name)

Example:

Might return 2.5

STDDEV(response_time)

Use Case:

Used for quality control, performance consistency, and risk assessment.

VARIANCE - Variance

Purpose:

Measures how much numbers vary from the average (squared differences).

Usage:

VARIANCE(column_name)

Example:

Might return 1250000

VARIANCE(sales)

Use Case:

Used for financial analysis, quality metrics, and forecasting accuracy.

Text functions

CONCAT - Join Text

Purpose:

Combines multiple pieces of text into one.

Usage:

CONCAT(text1, text2, text3...)

Example:

Returns "Hello World"

CONCAT('Hello', ' ', 'World')

Use Case:

Used for creating full names, building addresses, generating reports, and email formatting.

UPPER - Make Uppercase

Purpose:

Converts all letters to capital letters.

Usage:

UPPER(text)

Example:

Returns "HELLO"

UPPER('hello')

Use Case:

Used for standardizing data entry, creating consistent formats, and report headers.

LOWER - Make Lowercase

Purpose:

Converts all letters to small letters.

Usage:

LOWER(text)

Example:

Returns "hello"

LOWER('HELLO')

Use Case:

Used for email addresses, usernames, data cleaning, and search functions.

INITCAP - Capitalize Words

Purpose:

Makes the first letter of each word uppercase.

Usage:

INITCAP(text)

Example:

Returns "John Smith"

INITCAP('john smith')

Use Case:

Used for proper names, titles, address formatting, and professional correspondence.

TRIM - Remove Extra Spaces

Purpose:

Removes spaces from the beginning and end of text.

Usage:

TRIM(text)

Example:

Returns "Hello World"

TRIM(' Hello World ')

Use Case:

Used for data cleaning, form processing, and import/export cleanup.

LTRIM - Remove Left Spaces

Purpose:

Removes spaces only from the beginning of the text.

Usage:

LTRIM(text)

Example:

Returns "Hello"

LTRIM(' Hello')

Use Case:

Used for data formatting, code cleanup, and text alignment.

SUBSTRING - Extract Text Portion

Purpose:

Takes out a specific part of the text.

Usage:

SUBSTRING(text, start_position, length)

Example:

Returns "atab"

SUBSTRING('Database', 2, 4)

Use Case:

Used for extracting codes, parsing data, creating abbreviations, and ID processing.

RIGHT - Get End Characters

Purpose:

Takes a specified number of characters from the end of the text.

Usage:

RIGHT(text, number_of_characters)

Example:

Returns "base"

RIGHT('Database', 4)

Use Case:

Used for file extensions, postal codes, ID suffixes, and validation codes.

REGEXP_REPLACE - Pattern Replacement

Purpose:

Finds and replaces text patterns using advanced matching rules.

Usage:

REGEXP_REPLACE(text, pattern, replacement, flags)

Example:

Returns 1234567890

REGEXP_REPLACE('123-456-7890', '-', '', 'g')

Use Case:

Used for phone number formatting, data cleaning, removing special characters, and standardizing formats.

Date functions

CURRENT_DATE - Today's Date

Purpose:

Returns today's date.

Usage:

CURRENT_DATE

Example:

Returns 2025-01-22

CURRENT_DATE

Use Case:

Used for timestamping records, calculating ages, due date calculations, and report dating.

CURRENT_TIMESTAMP - Current Date and Time

Purpose:

Returns the exact current date and time.

Usage:

CURRENT_TIMESTAMP

Example:

Returns 2025-01-22 10:15:30

CURRENT_TIMESTAMP

Use Case:

Used for transaction logging, audit trails, real-time reporting, and activity tracking.

AGE - Calculate Time Difference

Purpose:

Calculates the number of days between two dates.

Usage:

AGE(end_date, start_date)

Example:

Returns 7

AGE('2025-01-22', '2025-01-15')

Use Case:

Used for customer tenure, project duration, payment delays, and service periods.

DATE_PART - Extract Date Component

Purpose:

Pulls out specific parts of a date (year, month, day, etc.).

Usage:

DATE_PART('part', date)

Example:

Returns 2025

DATE_PART('year', '2025-01-22')

Use Case:

Used for monthly reports, yearly summaries, seasonal analysis, and age calculations.

DATE_TRUNC - Round Date

Purpose:

Rounds dates down to a specific time period.

Usage:

DATE_TRUNC('precision', date)

Example:

Returns 2025-01-01

DATE_TRUNC('month', '2025-01-22')

Use Case:

Used for monthly grouping, quarterly reports, trend analysis, and data aggregation.

Utility functions

COALESCE - Handle Missing Data

Purpose:

Returns the first non-empty value from a list.

Usage:

COALESCE(value1, value2, value3...)

Example:

Returns "Fallback"

COALESCE(NULL, 'Fallback', 'Default')

Use Case:

Used for default values, handling missing data, report formatting, and data validation.

CASE WHEN - Conditional Logic

Purpose:

Creates an if-then-else logic for different scenarios.

Usage:

CASE WHEN condition1 THEN result1 WHEN condition2 THEN result2 ELSE default_result END

Example:

Assigns 'A' if score >= 90, 'B' if score >= 80, else 'C'

CASE WHEN score >= 90 THEN 'A' WHEN score >= 80 THEN 'B' ELSE 'C' END

Use Case:

Used for grade calculations, status assignments, categorization, conditional formatting, and business rules.

Best practices

Naming conventions

Column and table names

--  ✅  GOOD:  Clear,  descriptive  names
SELECT  customer_first_name,  order_total_amount
FROM  customer_orders

--  ❌  AVOID:  Cryptic  abbreviations
SELECT  cust_fn,  ord_tot_amt
FROM  cust_ord

Formula aliases

--  ✅  GOOD:  Meaningful  aliases
SELECT
    CONCAT(first_name,  '  ',  last_name)  AS  full_name,
    AVG(salary)  AS  average_salary,
    COUNT(*)  AS  total_employees

--  ❌  AVOID:  Generic  or  no  aliases
SELECT
    CONCAT(first_name,  '  ',  last_name),
    AVG(salary)  AS  avg1,
    COUNT(*)  AS  cnt

Code formatting

Readable structure

-- ✅ GOOD: Well-formatted and indented
SELECT
    department_name,
    COUNT(employee_id) AS employee_count,
    AVG(salary) AS average_salary,
    MAX(hire_date) AS latest_hire_date
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE hire_date >= '2020-01-01'
GROUP BY department_name
ORDER BY average_salary DESC;

-- ❌ AVOID: Everything on one line
SELECT department_name,COUNT(employee_id),AVG(salary),MAX(hire_date) FROM employees e JOIN departments d ON e.department_id = d.department_id WHERE hire_date >= '2020-01-01' GROUP BY department_name ORDER BY AVG(salary) DESC;

Function spacing

-- ✅ GOOD: Consistent spacing
CASE
    WHEN score >= 90 THEN 'Excellent'
    WHEN score >= 80 THEN 'Good'
    WHEN score >= 70 THEN 'Average'
    ELSE 'Needs Improvement'
END AS performance_rating

-- ❌ AVOID: Inconsistent spacing
CASE WHEN score>=90 THEN'Excellent'WHEN score>=80 THEN 'Good'ELSE'Poor'END

Performance optimization

Use appropriate functions

-- ✅ GOOD: Efficient for large datasets
SELECT
    DATE_TRUNC('month', order_date) AS order_month,
    SUM(total_amount) AS monthly_revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

-- ❌ AVOID: Less efficient
SELECT
    SUBSTRING(order_date::text, 1, 7) AS order_month,
    SUM(total_amount) AS monthly_revenue
FROM orders
GROUP BY SUBSTRING(order_date::text, 1, 7);

Minimize function calls in WHERE clauses

-- ✅ GOOD: Index-friendly
SELECT * FROM orders
WHERE order_date >= '2024-01-01'
AND order_date < '2025-01-01';

-- ❌ AVOID: Prevents index usage
SELECT * FROM orders
WHERE DATE_PART('year', order_date) = 2024;

Null handling

Always plan for null values

-- ✅ GOOD: Explicit NULL handling
SELECT
    customer_name,
    COALESCE(phone_number, 'No phone provided') AS contact_phone,
    COALESCE(email, 'No email provided') AS contact_email
FROM customers;

-- ✅ GOOD: Safe division
SELECT
    product_name,
    CASE
        WHEN total_orders > 0
        THEN ROUND(total_revenue / total_orders, 2)
        ELSE 0
    END AS average_order_value
FROM product_summary;

COUNT function considerations

-- ✅ GOOD: Understand the difference
SELECT
    COUNT(*) AS total_rows, -- Counts all rows
    COUNT(email) AS customers_with_email, -- Counts non-NULL emails
    COUNT(DISTINCT city) AS unique_cities -- Counts unique non-NULL cities
FROM customers;

Data type awareness

Consistent data types

-- ✅ GOOD: Proper type handling
SELECT
    order_id,
    ROUND(total_amount, 2) AS formatted_total, -- Keep as numeric
    TO_CHAR(order_date, 'YYYY-MM-DD') AS formatted_date -- Convert to text when needed
FROM orders;

-- ❌ AVOID: Unnecessary type conversions
SELECT
    order_id::text, -- Unnecessary conversion
    total_amount::text -- Loses numeric properties
FROM orders;

Date handling

-- ✅ GOOD: Timezone aware
SELECT
    order_date,
    DATE_TRUNC('day', order_date AT TIME ZONE 'UTC') AS order_day_utc
FROM orders;

-- ✅ GOOD: Clear date arithmetic
SELECT
    customer_id,
    registration_date,
    AGE(CURRENT_DATE, registration_date) AS customer_tenure_days
FROM customers;

Conditional logic best practices

Clear CASE statements

-- ✅ GOOD: Comprehensive and readable
SELECT
    employee_id,
    salary,
    CASE
        WHEN salary >= 100000 THEN 'Senior'
        WHEN salary >= 75000 THEN 'Mid-Level'
        WHEN salary >= 50000 THEN 'Junior'
        WHEN salary >= 30000 THEN 'Entry-Level'
        ELSE 'Intern'
    END AS salary_band,
    CASE
        WHEN department_id IS NULL THEN 'Unassigned'
        ELSE department_name
    END AS department_display
FROM employees;

Nested functions

-- ✅ GOOD: Logical nesting with comments
SELECT
    customer_id,
    -- Clean and format customer names
    INITCAP(
        TRIM(
            COALESCE(customer_name, 'Unknown Customer')
        )
    ) AS formatted_customer_name
FROM customers;

Aggregation best practices

Meaningful grouping

-- ✅ GOOD: Clear grouping logic
SELECT
    DATE_TRUNC('quarter', order_date) AS quarter,
    product_category,
    COUNT(DISTINCT customer_id) AS unique_customers,
    COUNT(order_id) AS total_orders,
    SUM(order_total) AS total_revenue,
    AVG(order_total) AS average_order_value,
    STDDEV(order_total) AS revenue_volatility
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE order_date >= '2024-01-01'
GROUP BY
    DATE_TRUNC('quarter', order_date),
    product_category
HAVING COUNT(order_id) >= 10 -- Only include categories with significant volume
ORDER BY quarter DESC, total_revenue DESC;

Documentation & comments

Self-documenting queries

-- ✅ GOOD: Clear purpose and logic
-- Customer Lifetime Value Analysis
-- Calculates CLV for customers active in the last 2 years
SELECT
    c.customer_id,
    c.customer_name,
    -- Customer tenure in days
    AGE(CURRENT_DATE, c.registration_date) AS customer_age_days,
    -- Purchase behavior metrics
    COUNT(o.order_id) AS total_orders,
    SUM(o.order_total) AS lifetime_revenue,
    AVG(o.order_total) AS average_order_value,
    -- Recency calculation
    AGE(CURRENT_DATE, MAX(o.order_date)) AS days_since last order,
    -- Customer lifetime value estimate
    CASE
        WHEN COUNT(o.order_id) > 0 THEN
            ROUND(
                SUM(o.order_total) /
                NULLIF(AGE(CURRENT_DATE, c.registration_date), 0) * 365,
            2
            )
        ELSE 0
    END AS estimated_annual_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.registration_date >= CURRENT_DATE - INTERVAL '2 years'
GROUP BY c.customer_id, c.customer_name, c.registration_date
HAVING COUNT(o.order_id) > 0 -- Only customers with purchases
ORDER BY estimated_annual_value DESC;

Error prevention

Defensive programming

-- ✅ GOOD: Prevent division by zero and handle edge cases
SELECT
    product_id,
    product_name,
    total_inventory,
    total_sold,
    -- Safe percentage calculation
    CASE
        WHEN total_inventory > 0 THEN
            ROUND((total_sold::decimal / total_inventory) * 100, 2)
        ELSE 0
    END AS sell_through_percentage,
    -- Safe average calculation
    CASE
        WHEN order_count > 0 THEN
            ROUND(total_revenue / order_count, 2)
        ELSE 0
    END AS average order value
FROM product_metrics;

Input validation

-- ✅ GOOD: Validate and clean inputs
SELECT
    customer_id,
    -- Clean phone numbers
    REGEXP_REPLACE(
        REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), -- Remove non-digits
        '^1?(.{10})$',
        '\1'
    ) AS cleaned_phone,
    -- Validate email format
    CASE
        WHEN email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
        THEN LOWER(TRIM(email))
        ELSE NULL
    END AS validated_email
FROM customers;

Testing & validation

Data quality checks

-- ✅ GOOD: Include data validation in your queries
SELECT
    'Customer Data Quality' AS report_type,
    COUNT(*) AS total_customers,
    COUNT(CASE WHEN email IS NOT NULL THEN 1 END) AS customers_with_email,
    COUNT(CASE WHEN phone IS NOT NULL THEN 1 END) AS customers_with_phone,
    COUNT(CASE WHEN registration_date > CURRENT_DATE THEN 1 END) AS future registration dates,
    AVG(CASE WHEN customer_name IS NOT NULL THEN LENGTH(customer_name) END) AS avg_name_length
FROM customers;

Common anti-patterns to avoid

What not to do

-- ❌ AVOID: Using functions unnecessarily
WHERE UPPER(status) = 'ACTIVE' -- Instead create index on status and use = 'ACTIVE'

-- ❌ AVOID: Complex nested subqueries when JOINs work better
SELECT * FROM customers WHERE customer_id IN (
    SELECT customer_id FROM orders WHERE order_date IN (
        SELECT MAX(order_date) FROM orders GROUP BY customer_id
    )
);

-- ❌ AVOID: Not handling NULLs in aggregations
AVG(rating) -- Might not behave as expected with NULLs

-- ❌ AVOID: Hardcoded values without context
WHERE status_code = 3 -- Use meaningful names: WHERE status = 'ACTIVE'

Summary checklist

Before deploying your query, ensure to check the following:

Column names are clear and descriptive
Functions have meaningful aliases
Code is properly formatted and indented
NULL values are explicitly handled
Data types are appropriate and consistent
Performance implications are considered
Edge cases (division by zero, empty sets) are handled
Comments explain complex logic
Query has been tested with sample data
Results are validated for accuracy

Performance review

Review the following before deploying the query:

WHERE clauses don't prevent index usage
Appropriate functions are used for data types
Aggregations are efficient
Complex calculations are broken into readable steps
Query execution plan has been reviewed

Overview​

What you can achieve?​

Mathematical functions​

ABS - Absolute Value​

Purpose:​

Usage:​

Example:​

Use Case:​

Mathematical functions​

CEIL - Round Up​

Purpose:​

Usage:​

Example:​

Use Case:​

FLOOR - Round Down​

Purpose:​

Usage:​

Example:​

Use Case:​

ROUND - Smart Rounding​

Purpose:​

Usage:​

Example:​

Use Case:​

POWER - Exponents​

Purpose:​

Usage:​

Example:​

Use Case:​

SQRT - Square Root​

Purpose:​

Usage:​

Example:​

Use Case:​

LN - Natural Logarithm​

Purpose:​

Usage:​

Example:​

Use Case:​

PI - Pi Constant​

Purpose:​

Usage:​

Example:​

Use Case:​

MOD - Remainder​

Purpose:​

Usage:​

Example:​

Use Case:​

Statistical functions​

AVG - Average​

Purpose:​

Usage:​

Example:​

Use Case:​

SUM - Total​

Purpose:​

Usage:​

Example:​

Use Case:​

COUNT - Count Items​

Purpose:​

Usage:​

Example:​

Use Case:​

MIN - Smallest Value​

Purpose:​

Usage:​

Example:​

Use Case:​

MAX - Largest Value​

Purpose:​

Usage:​

Example:​

Use Case:​

MEDIAN - Middle Value​

Purpose:​

Usage:​

Example:​

Use Case:​

Overview

What you can achieve?

Mathematical functions

ABS - Absolute Value

Purpose:

Usage:

Example:

Use Case:

Mathematical functions

CEIL - Round Up

Purpose:

Usage:

Example:

Use Case:

FLOOR - Round Down

Purpose:

Usage:

Example:

Use Case:

ROUND - Smart Rounding

Purpose:

Usage:

Example:

Use Case:

POWER - Exponents

Purpose:

Usage:

Example:

Use Case:

SQRT - Square Root

Purpose:

Usage:

Example:

Use Case:

LN - Natural Logarithm

Purpose:

Usage:

Example:

Use Case:

PI - Pi Constant

Purpose:

Usage:

Example:

Use Case:

MOD - Remainder

Purpose:

Usage:

Example:

Use Case:

Statistical functions

AVG - Average

Purpose:

Usage:

Example:

Use Case:

SUM - Total

Purpose:

Usage:

Example:

Use Case:

COUNT - Count Items

Purpose:

Usage:

Example:

Use Case:

MIN - Smallest Value

Purpose:

Usage:

Example:

Use Case:

MAX - Largest Value

Purpose:

Usage:

Example:

Use Case:

MEDIAN - Middle Value

Purpose:

Usage:

Example:

Use Case: