Databases as a Java developer

Databases as a Java developer

JPA (Java Persistence API) Basics

  • JPA is a specification for ORM (Object-Relational Mapping) in Java.

  • Common implementations: Hibernate, EclipseLink, OpenJPA.

  • Core annotations:

    • @Entity – Marks a class as a database entity.

    • @Id – Specifies the primary key.

    • @GeneratedValue – Configures ID generation strategy.

    • @OneToOne, @OneToMany, @ManyToOne, @ManyToMany – Define relationships.

Transactions, Isolation, and Propagation

  • Transactions ensure that database operations are atomic.

  • Isolation Levels control how transactions affect each other:

    • READ UNCOMMITTED – Allows dirty reads.

    • READ COMMITTED – Prevents dirty reads but allows non-repeatable reads.

    • REPEATABLE READ – Prevents dirty and non-repeatable reads but allows phantom reads.

    • SERIALIZABLE – The strictest, avoiding all concurrency issues.

  • Propagation determines how transactions behave when calling another transactional method:

    • REQUIRED – Uses the current transaction or creates a new one.

    • REQUIRES_NEW – Always starts a new transaction.

    • NESTED – Creates a nested transaction inside the current one.

    • SUPPORTS, NOT_SUPPORTED, MANDATORY, NEVER – Other variations.

  • @Transactional in Spring manages transactions at the service layer.

Joins and How to Create Them

  • INNER JOIN – Returns matching rows from both tables.

  • LEFT JOIN – Returns all rows from the left table and matching ones from the right.

  • RIGHT JOIN – Returns all rows from the right table and matching ones from the left.

  • FULL JOIN – Returns all rows when there is a match in either table.

  • Example:

      SELECT e.name, d.department_name
      FROM employees e
      INNER JOIN departments d ON e.department_id = d.id;
    

Indexes and Their Usage

  • Indexes speed up queries by allowing quick lookups.

  • Types of Indexes:

    • B-Tree Index – Most common, balanced search tree structure.

    • Hash Index – Used for equality lookups.

    • Unique Index – Enforces uniqueness of values.

    • Composite Index – Multi-column index.

    • Full-Text Index – For text search.

  • When to use indexes?

    • On primary keys and foreign keys.

    • On frequently searched columns.

    • Avoid indexing small tables or frequently updated columns.

  • Example of creating an index:

      CREATE INDEX idx_employee_name ON employees(name);
    

Dirty Reads

A dirty read occurs when a transaction reads data that has been modified by another transaction but not yet committed. If the modifying transaction rolls back, the read data becomes invalid (hence, "dirty").

Example of a Dirty Read

Suppose we have an accounts table:

CREATE TABLE accounts (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    balance DECIMAL(10,2)
);
  • Transaction 1 updates a balance but does not commit yet:

      BEGIN TRANSACTION;
      UPDATE accounts SET balance = 5000 WHERE id = 1;
      -- No COMMIT or ROLLBACK yet
    
  • Transaction 2 reads the uncommitted balance:

      SELECT balance FROM accounts WHERE id = 1;
    
  • If Transaction 1 rolls back:

      ROLLBACK;
    

    Transaction 2’s read was incorrect because the balance was never actually updated.

Prevention: Set isolation level to READ COMMITTED or higher.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

More SQL Join Examples

1. INNER JOIN – Fetch Employees with Department Names

SELECT e.name AS employee_name, d.name AS department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.id;

📌 Use case: When you only want records that exist in both tables.


2. LEFT JOIN – Employees with or Without Departments

SELECT e.name AS employee_name, d.name AS department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.id;

📌 Use case: If you want to include employees even if they don’t belong to a department.


3. RIGHT JOIN – Departments with or Without Employees

SELECT d.name AS department_name, e.name AS employee_name
FROM departments d
RIGHT JOIN employees e ON e.department_id = d.id;

📌 Use case: If you want to include departments even if no employees are assigned to them.


4. FULL OUTER JOIN – All Employees and Departments

SELECT e.name AS employee_name, d.name AS department_name
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.id;

📌 Use case: When you want to see all employees and all departments, even if there are unmatched records.


5. Self JOIN – Find Employees with the Same Manager

SELECT e1.name AS employee, e2.name AS manager
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id;

📌 Use case: Useful for hierarchical relationships (e.g., org structures).


Real-Life Scenarios for Choosing Indexes

  1. Primary Keys and Unique Constraints

    • Every primary key has an index automatically.

    • Example:

        CREATE UNIQUE INDEX idx_email ON users(email);
      
    • 📌 Use case: Ensuring unique emails in a user system.

  2. Foreign Key Indexes

    • Foreign keys should be indexed for faster lookups.

    • Example:

        CREATE INDEX idx_orders_customer ON orders(customer_id);
      
    • 📌 Use case: Speeding up joins between orders and customers.

  3. Indexes on Frequently Queried Columns

    • If a column is often used in WHERE conditions, index it.

    • Example:

        CREATE INDEX idx_product_price ON products(price);
      
    • 📌 Use case: Optimizing range queries like SELECT * FROM products WHERE price BETWEEN 10 AND 100;

  4. Composite Indexes for Multi-Column Searches

    • When multiple columns are frequently searched together.

    • Example:

        CREATE INDEX idx_name_dob ON users(last_name, date_of_birth);
      
    • 📌 Use case: Searching users by both last name and date of birth.

  5. Full-Text Indexes for Search

    • When searching in large text fields.

    • Example:

        CREATE FULLTEXT INDEX idx_articles_content ON articles(content);
      
    • 📌 Use case: Searching articles with keyword-based search.

  6. Avoid Over-Indexing!

    • Too many indexes slow down INSERT, UPDATE, and DELETE.

    • Use indexing only where necessary.