Database

Comic decision tree for NoSQL/RDBMS

Transaction principles (ACID)

  1. Atomicity — операция либо целиком выполняется полностью изменяя данные, либо отменяется и не оставляет за собой следов.
  2. Consistency (integrity) — после того как приложение выполняет операцию ее результат становится доступным для всех последующих операций.
  3. Isolation — операции одних пользователей не имеют побочных эффектов на других пользователей.
  4. Durability — изменения сделанные в результате успешной опрации сохраняют результат даже в случае системного сбоя.

Transaction isolation levels

The isolation level measures concurrent transactions' capacity to view data that have been updated, but not yet committed, by another transaction. If other transactions were allowed to read data that are as-yet uncommitted, those transactions could end up with inconsistent data were the transaction to roll back, or end up waiting unnecessarily were the transaction to commit successfully.

Depending on isolation level the following phenomena are permissible1):

A higher isolation level means less concurrence and a greater likelihood of performance bottlenecks, but also a decreased chance of reading inconsistent data. A good rule of thumb is to use the highest isolation level that yields an acceptable performance level. The following are common isolation levels, arranged from lowest to highest:

Feature comparison

Title MySQL HSQL DB2 MSSQL
Show query execution plan EXPLAIN SELECT ... EXPLAIN PLAN FOR SELECT ... EXPLAIN SELECT ... SET SHOWPLAN_ALL ON;
SELECT …;
SET SHOWPLAN_ALL OFF;
2)
Number of characters that can actually be stored in VARCHAR column

MySQL interprets length specifications in character column definitions in character units (before MySQL 4.1 column lengths were interpreted in bytes). This applies to CHAR, VARCHAR, and TEXT types. Thus VARCHAR column that uses the UTF-8 character set can be declared to be a maximum of 21,844 characters as up to three bytes can be required per character3).

The capacity of VARCHAR is in characters. The default for unit is CHARACTERS thus declaration A VARCHAR(20) is equivalent to A VARCHAR(20 CHARACTERS). The characters are stored in UTF-16.4)

VARCHAR(n): Varying-length character strings with a maximum length of n bytes. n must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 327045).

Variable-length Unicode UCS-2 string data. n defines the string length and can be a value from 1 through 4,0006).

Get last generated primary key SELECT LAST_INSERT_ID() CALL IDENTITY() VALUES IDENTITY_VAL_LOCAL() SELECT @@IDENTITY
Limit the number of result rows of SELECT statement to X rows SELECT … LIMIT X SELECT … LIMIT X SELECT … FETCH FIRST X ROWS ONLY SELECT … OFFSET 0 FETCH FIRST X ROWS ONLY
ON UPDATE CASCADE for foreign keys :YES: Using FOREIGN KEY Constraints :YES: Referential constraint definition :NO: FOREIGN KEY rule-clause :YES: Cascading Referential Integrity Constraints
GROUP BY ROLLUP syntax GROUP BY col1, col2, … WITH ROLLUP7) :NO: (check forum). As workaround use UNION of several SELECT statements. GROUP BY ROLLUP(col1, col2, …)8) GROUP BY ROLLUP(col1, col2, …)
GROUP BY CUBE syntax :NO: :NO: GROUP BY CUBE(col1, col2, …) GROUP BY CUBE(col1, col2, …)
Calculate difference between two dates (date2 - date1) in seconds TIMESTAMPDIFF(second, date1, date2) DATEDIFF('second', date1, date2) TIMESTAMPDIFF(2, TIMESTAMP(date2) - TIMESTAMP(date1))9) DATEDIFF(second, date1, date2)
Convert date to UNIX/POSIX timestamp UNIX_TIMESTAMP(date) UNIX_TIMESTAMP(date) CAST (DAYS(date) - DAYS('1970-01-01') AS INTEGER) * 86400 + MIDNIGHT_SECONDS(date) DATEDIFF(second, '1970-01-01', date)
Validation query (see also here) SELECT 1 SELECT 1 FROM information_schema.system_users VALUES 1 SELECT 1

IBM DB2

DB2 drawbacks:

How to reorganize a table (REORG table)?

After one has altered the table, it needs to be reorganized. To do so you need a DB2 Run-Time Client software, that includes DB2 CLI processor. After you start this processor (50918 is the port number):
db2 => CATALOG TCPIP NODE that_node REMOTE db2_hostname SERVER 50918
DB20000I  The CATALOG TCPIP NODE command completed successfully.
db2 => CATALOG DATABASE remote_db AS db_alias AT NODE that_node
DB20000I  The CATALOG DATABASE command completed successfully.
db2 => CONNECT TO db_alias USER my_user USING ****
Database Connection Information
...
db2 => REORG TABLE db_alias.table_name
DB20000I  The REORG command completed successfully.

How to add an identity column (primary key) to an existing table?

The examples given here and here work for DB2 v9.x with slight change, which is mentioned here:
ALTER TABLE TABLE_NAME ADD COLUMN id INTEGER NOT NULL WITH DEFAULT 0
ALTER TABLE TABLE_NAME ALTER COLUMN id DROP DEFAULT
ALTER TABLE TABLE_NAME ALTER COLUMN id SET GENERATED ALWAYS AS IDENTITY
REORG TABLE TABLE_NAME
UPDATE TABLE_NAME SET id = DEFAULT
ALTER TABLE TABLE_NAME ALTER COLUMN id SET PRIMARY KEY

It might happen that something will go wrong with above. Then consider the alternative of creating a new table plus copying all data from old table to new one.

How to rename a table?

From one perspective you may find it pretty straightforward:
RENAME TABLE TABLE_NAME TO new_table_name

however if table has a foreign key constraint or another table refers it, this command will fail. So you need to drop these constrains first. The list of them you can find in a table SYSCAT.REFERENCES:

SELECT * FROM syscat.references WHERE tabname = 'table_name'
ALTER TABLE TABLE_NAME DROP CONSTRAINT table_name_fk

How concatenate INTEGER and VARCHAR ?

The CONCAT() function fails to concatenate string and integer. Something like SELECT text_column || ' (' || number_column || ')' FROM … results the following error:
DB2 SQL Error: SQLCODE=-440, SQLSTATE=42884, SQLERRMC=||;FUNCTION
No routine was found with the specified name and compatible arguments.

One need to cast from number to string before using CONCAT(). The normal casting like CAST(number_column AS CHAR(10)) or CAST(number_column AS VARCHAR(1024)) do not work. The solution is:
SELECT text_column || ' (' || RTRIM(CHAR(number_column)) || ')' FROM ...

Another solution:

SELECT VARCHAR(CAST(number_column AS CHAR)) || text_column FROM TABLE

On MSSQL:

SELECT CAST(number_column AS VARCHAR) + text_column FROM TABLE

How to learn what SQL statement and what connection is holding a lock on the table (which causes SQLSTATE=40001 error)? How to select from the table in non-blocking mode?

To list all locks on the DB per application:
db2 => CONNECT TO db_alias USER my_user USING ****
Database Connection Information
...
db2 => ATTACH TO that_node USER my_user USING ****
Instance Attachment Information
...
db2 => GET SNAPSHOT FOR LOCKS ON db_alias
Database Lock Snapshot
...
List Of Locks
 Lock Name                   = 0x02000D000401B0110000000052
 Lock Attributes             = 0x00000008
 Release Flags               = 0x40000000
 Lock Count                  = 1
 Hold Count                  = 0
 Lock Object Name            = 296747268
 Object Type                 = Row
 Tablespace Name             = USERSPACE1
 Table Schema                = DB_ALIAS
 Table Name                  = TABLE_NAME
 Mode                        = X

See also:

How to avoid deadlocks?

The application attempts to write into the same table from parallel threads. It causes the DB to fail with the following exception:

com.ibm.db2.jcc.a.pn: The current transaction has been rolled back because of a deadlock or timeout. Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=3.52.95

Also posted as Parallel inserts cause deadlock on DB2.

The problem is that lock space is exhausted much earlier then transaction log is exhausted. Solutions:
  • Increase lock list space or set it to AUTOMATIC:
    db2 "update db cfg using LOCKLIST 3200 AUTOMATIC"
    db2 "update db cfg using MAXLOCKS 60 AUTOMATIC"
  • Commit more frequently.

See also:

  • locklists – lock list maximum storage configuration parameter
  • maxlocks – maximum percent of lock list before escalation configuration parameter

Is there any analogue of MySQL GROUP_CONCAT() function in DB2?

Unfortunately, the only aggregate function that is available in DB2 is XMLAGG() (taken from Create a delimited string from a query in DB2).
SELECT
    group_column,
    SUBSTR(XMLSERIALIZE(XMLAGG(XMLTEXT(CONCAT(', ', concat_column))) AS VARCHAR(1024)), 3)
FROM TABLE_NAME
GROUP BY group_column

How enable logging for DB2 JDBC driver?

Use the following URL parameters: jdbc:db2://db2server:50012/MYDB:traceFile=c:/jdbc.log;TraceLevel=TRACE_ALL;

How to make SQL error codes resolved to text messages?

From Friendly Error Message for DB2 and Properties for the IBM DB2 Driver for JDBC and SQLJ:
  • Enable retrieveMessagesFromServerOnGetMessage JDBC property: jdbc:db2://db2server:50012/MYDB:currentSchema=MYSCHEMA;retrieveMessagesFromServerOnGetMessage=true;
  • Create SYSIBM.SQLCAMESSAGECCSID stored procedure.

Afterwards the SQLException.getMessage() will resolve the numeric code to text message, for example: "MYSCHEMA.DUMMY_TABLE" is an undefined name.. SQLCODE=-204, SQLSTATE=42704, DRIVER=3.52.95

If the error message is still not resolved, and driver trace shows that call to SQLCAMESSAGECCSID results SQLCODE=-1042 SQLSTATE=58004 error, check that DB2_FMP_COMM_HEAPSZ10) registry variable is not set to 0. In this case either remove this variable to use the default value. Check here for more information.

How to specify the transaction isolation level for DB2 JDBC driver?

It can be specified in JDBC URL using defaultIsolationLevel parameter. The value is one from java.sql.Connection constants:
<bean id="dbDataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="com.ibm.db2.jcc.DB2Driver" />
    <property name="url" value="jdbc:db2://db2server:50012/my_db:currentSchema=MY_SCHEMA;defaultIsolationLevel=2;" />
    <property name="validationQuery" value="values 1" />
    <property name="username" value="my_user" />
    <property name="password" value="my_pass" />
</bean>

Where can I download latest JDBC drivers?

You can download driver v3.63.123 from here (registration is required).

MS SQL

How to view queries that are blocked?

SELECT
    tl.request_session_id [waitingSeesionID],
    wt.blocking_session_id [blockingSessionID],
    wt.resource_description,
    wt.wait_type,
    wt.wait_duration_ms,
    db_name (tl.resource_database_id) [DB],
    tl.resource_associated_entity_id [WaitingAssEntity],
    tl.resource_type,
    wrt.[text] [Waiting_TSQL],
    btl.request_type [BlockingReqType],
    brt.[text] [Blocking_TSQL]
FROM sys.dm_tran_locks tl
JOIN sys.dm_os_waiting_tasks wt ON tl.lock_owner_address = wt.resource_address
JOIN sys.dm_exec_requests wr ON wr.session_id = tl.request_session_id
CROSS APPLY sys.dm_exec_sql_text(wr.sql_handle) AS wrt
LEFT JOIN sys.dm_exec_requests br ON br.session_id = wt.blocking_session_id
OUTER APPLY sys.dm_exec_sql_text(br.sql_handle) AS brt
LEFT JOIN sys.dm_tran_locks AS btl ON br.session_id = btl.request_session_id

How to avoid deadlocks?

  • Modify non-clustered index locking behaviour.
    Page level locking is often associated with non-clustered indexes. This can create a problem because page locks prevent rows being accessed on the page by other transactions. Thus non clustered index usage may introduce extra locking overhead on a table. To get around this and reduce the index locking overhead you can direct the index to lock at row level.
    ALTER INDEX myindex ON mytable SET ( ALLOW_ROW_LOCKS = ON );

    Or more extreme:

    ALTER INDEX myindex ON mytable SET ( ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF );
  • Disable table page lock escalation (also see Lock Escalation):
    ALTER TABLE my_table SET ( LOCK_ESCALATION = DISABLE );
  • Change transaction isolation level to READ UNCOMIITTED.

Are DDL statements transactional?

From How to use transactions with SQL Server DDL and Applying transaction for DDL statements:

Pretty much everything that happens within the scope of an individual database is part of the transaction and can be rolled back, including DDL.

How to connect to MSSQL server from Perl?

First option is to install FreeTDS:
  • Install libiodbc-dev.
  • Install krb5-dev / libkrb5-dev and (comerr-dev / libcom_err-dev).
  • Download FreeTDS tarball, unpack it.
  • In case you have compiled above libraries from sources run:
    CPPFLAGS=-I$HOME/include ./configure –prefix=$HOME –with-iodbc=$HOME –enable-krb5=gssapi_krb5
  • Then make all install
  • Install DBI:
    perl -MCPAN -Mlocal::lib -I~/perl5 -e "install DBI"
  • Install DBD::Sybase:
    SYBASE=$HOME perl -MCPAN -I~/perl5 -Mlocal::lib -e 'notest("install", "DBD::Sybase")'

Issues:

  • FreeTDS supports passwords no longer then 30 characters (check here and there, bug#706388). Apply this patch:

    ./include/tds.h

    --- ./include/tds.h     2013-04-26 11:17:07.000000000 +0200
    +++ ./include/tds.h.orig        2012-07-17 10:16:16.000000000 +0200
    @@ -499,7 +499,7 @@
    
     #define TDS_ALIGN_SIZE sizeof(tds_align_struct)
    
    -#define TDS_MAX_LOGIN_STR_SZ 50
    +#define TDS_MAX_LOGIN_STR_SZ 30
     typedef struct tds_login
     {
            DSTR server_name;               /**< server name (in freetds.conf) */
  • DBD::Sybase supports passwords no longer then 32 characters (check bug#706392). Apply this patch:

    dbdimp.c

    --- dbdimp.h.orig    2013-04-03 22:16:20.000000000 +0200
    +++ dbdimp.h    2013-04-29 17:24:21.000000000 +0200
    @@ -51,6 +51,9 @@
    
     #define MAX_SQL_SIZE 255
    
    +/* The same as TDS_MAX_LOGIN_STR_SZ */
    +#define MAX_LOGIN_STR_SZ 50
    +
     /* Define dbh implementor data structure */
     struct imp_dbh_st {
         dbih_dbc_t com; /* MUST be first element in structure    */
    @@ -71,8 +74,8 @@
         int lasterr;
         int lastsev;
    
    -    char uid[32];
    -    char pwd[32];
    +    char uid[MAX_LOGIN_STR_SZ];
    +    char pwd[MAX_LOGIN_STR_SZ];
    
         char server[64];
         char charset[64];
    --- dbdimp.c.orig    2013-04-03 22:16:20.000000000 +0200
    +++ dbdimp.c    2013-04-29 17:24:56.000000000 +0200
    @@ -1147,10 +1147,10 @@
             imp_dbh->server[63] = 0;
         }
    
    -    strncpy(imp_dbh->uid, uid, 32);
    -    imp_dbh->uid[31] = 0;
    -    strncpy(imp_dbh->pwd, pwd, 32);
    -    imp_dbh->pwd[31] = 0;
    +    strncpy(imp_dbh->uid, uid, MAX_LOGIN_STR_SZ);
    +    imp_dbh->uid[MAX_LOGIN_STR_SZ - 1] = 0;
    +    strncpy(imp_dbh->pwd, pwd, MAX_LOGIN_STR_SZ);
    +    imp_dbh->pwd[MAX_LOGIN_STR_SZ - 1] = 0;
    
         sv_setpv(DBIc_ERRSTR(imp_dbh), "");

Second option is Easysoft ODBC driver (example in How do I connect to a MSSQL database using Perl's DBI module?).

MySQL

How rename a database?

RENAME DATABASE was added in MySQL 5.1.7 but was found to be dangerous and was removed in MySQL 5.1.23. An alternative way is:
for table in `mysql -uroot -s -N -e "show tables from old_db"`; do mysql -uroot -s -N -e "rename table old_db.$table to new_db.$table;"; done

however it does not work if any of tables has a trigger (results Trigger in wrong schema error). In such case use more universal but more slow approach:

mysqldump old_db | mysql -uroot new_db
mysql -uroot -e 'drop database old_db'

mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table ...

Modify /etc/mysql/mysql.conf.d/mysqld.cnf so that it includes:
[mysqld]
net_read_timeout=300
net_write_timeout=300

and restart mysql (/etc/init.d/mysql restart). There is a request to allow to set these properties for mysqldump for a particular session but still not implemented, hence can only be set globally.

How to change default pager for mysql client?

Big Data

Apache Lucene / Elasticsearch:

JDBC datasource Spring configuration examples

Pooled MySQL datasource

<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close"
    p:driverClass="com.mysql.jdbc.Driver"
    p:jdbcUrl="jdbc:mysql://host:3306/my_db?autoReconnect=true"
    p:user="john"
    p:password="john_pass"
    p:maxPoolSize="20"
    p:maxStatements="50"
    p:testConnectionOnCheckin="true"
    p:idleConnectionTestPeriod="7200"
    p:preferredTestQuery="SELECT 1"
/>

Pooled HSQL datasource

<bean id="dataSource" class="org.apache.tomcat.jdbc.pool.DataSource" destroy-method="close"
    p:driverClassName="org.hsqldb.jdbcDriver"
    p:url="jdbc:hsqldb:mem:test"
    p:username="sa"
    p:password=""
    p:maxActive="20"
    p:testOnReturn="true"
    p:testWhileIdle="true"
    p:validationInterval="1800000"
    p:validationQuery="SELECT 1 FROM information_schema.tables"
/>

Embedded HSQL server

<?xml version="1.0" encoding="UTF-8"?>
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:jdbc="http://www.springframework.org/schema/jdbc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
        http://www.springframework.org/schema/beans        http://www.springframework.org/schema/beans/spring-beans-3.1.xsd
        http://www.springframework.org/schema/jdbc        http://www.springframework.org/schema/jdbc/spring-jdbc-3.1.xsd">
 
    <jdbc:embedded-database id="dataSource" />
</beans>

JNDI source

<?xml version="1.0" encoding="UTF-8"?>
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:jee="http://www.springframework.org/schema/jee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
        http://www.springframework.org/schema/beans        http://www.springframework.org/schema/beans/spring-beans-3.1.xsd
        http://www.springframework.org/schema/jee        http://www.springframework.org/schema/jee/spring-jee-3.1.xsd">
 
    <jee:jndi-lookup id="dataSource" jndi-name="java:jdbc/myDS" />
</beans>
1) From chapter 5 “Transactions, concurrency, and caching” of Hibernate In Action by Christian Bauer and Gavin King [Manning, 2005]
7) Starting from MySQL v4.1.1 released in Dec 2003
8) Available on DB2 9.x, released in Jul 2006
9) Implementation is not correct as does not treat leap years
10) The amount of shared memory (in 4KB pages) that DB2 tries to allocate at startup time for fenced routine communication