Docs Home
About TiDB Cloud
Get Started
Develop Applications
- Overview
- Quick Start
  - Build a TiDB Developer Cluster
  - CRUD SQL in TiDB
  - Build a Simple CRUD App with TiDB
    - Java
    - Golang
- Example Applications
  - Build a TiDB Application using Spring Boot
- Connect to TiDB
- Design Database Schema
- Write Data
- Read Data
- Transaction
- Optimize
  - Overview
  - SQL Performance Tuning
  - Best Practices for Performance Tuning
  - Best Practices for Indexing
  - Other Optimization Methods
    - Avoid Implicit Type Conversions
    - Unique Serial Number Generation
- Troubleshoot
- Reference
  - Bookshop Example Application
  - Guidelines
    - Object Naming Convention
    - SQL Development Specifications
- Cloud Native Development Environment
  - Gitpod
Manage Cluster
- Plan Your Cluster
- Create a TiDB Cluster
- Connect to Your TiDB Cluster
  - Connect via a SQL Client
  - Connect via SQL Shell
- Set Up VPC Peering Connections
- Use an HTAP Cluster with TiFlash
- Scale a TiDB Cluster
- Upgrade a TiDB Cluster
- Delete a TiDB Cluster
- Use TiDB Cloud API (Beta)
Migrate Data
Back Up and Restore
Monitor and Alert
- Overview
- Built-in Monitoring
- Built-in Alerting
- Third-Party Monitoring Integrations
  - Datadog Integration
  - Prometheus and Grafana Integration
Tune Performance
- Overview
- Analyze Performance
- SQL Tuning
  - Overview
  - Understanding the Query Execution Plan
  - SQL Optimization Process
    - Overview
    - Logic Optimization
    - Physical Optimization
    - Prepare Execution Plan Cache
  - Control Execution Plans
- TiKV Follower Read
- Coprocessor Cache
- Garbage Collection (GC)
  - Overview
  - Configuration
- Tune TiFlash performance
Manage User Access
- Manage Console User Access
- Configure Cluster Security Settings
Billing
Reference
FAQs
Release Notes
Support
Glossary

Unstable Result Set

This document describes how to solve unstable result set errors.

GROUP BY

For convenience, MySQL "extends" the GROUP BY syntax to allow the SELECT clause to refer to non-aggregated fields not declared in the GROUP BY clause, that is, the NON-FULL GROUP BY syntax. In other databases, this is considered a syntax ERROR because it causes unstable result sets.

For example, you have two tables:

stu_info stores the student information
stu_score stores the student test scores.

Then you can write a SQL query statement like this:

SELECT
    `a`.`class`,
    `a`.`stuname`,
    max( `b`.`courscore` )
FROM
    `stu_info` `a`
    JOIN `stu_score` `b` ON `a`.`stuno` = `b`.`stuno`
GROUP BY
    `a`.`class`,
    `a`.`stuname`
ORDER BY
    `a`.`class`,
    `a`.`stuname`;

Result:

+------------+--------------+------------------+
| class      | stuname      | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy |             95.5 |
| 2018_CS_03 | PatrickStar  |             99.0 |
| 2018_CS_03 | SpongeBob    |             95.0 |
+------------+--------------+------------------+
3 rows in set (0.00 sec)

The a.class and a.stuname fields are specified in the GROUP BY statement, and the selected columns are a.class, a.stuname and b.courscore. The only column that is not in the GROUP BY condition, b.courscore, is also specified with a unique value using the max() function. There is ONLY ONE result that satisfies this SQL statement without any ambiguity, which is called the FULL GROUP BY syntax.

A counterexample is the NON-FULL GROUP BY syntax. For example, in these two tables, write the following SQL query (delete a.stuname in GROUP BY).

SELECT
    `a`.`class`,
    `a`.`stuname`,
    max( `b`.`courscore` )
FROM
    `stu_info` `a`
    JOIN `stu_score` `b` ON `a`.`stuno` = `b`.`stuno`
GROUP BY
    `a`.`class`
ORDER BY
    `a`.`class`,
    `a`.`stuname`;

Then two values that match this SQL are returned.

The first returned value:

+------------+--------------+------------------------+
| class      | stuname      | max( `b`.`courscore` ) |
+------------+--------------+------------------------+
| 2018_CS_01 | MonkeyDLuffy |                   95.5 |
| 2018_CS_03 | PatrickStar  |                   99.0 |
+------------+--------------+------------------------+

The second returned value:

+------------+--------------+------------------+
| class      | stuname      | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy |             95.5 |
| 2018_CS_03 | SpongeBob    |             99.0 |
+------------+--------------+------------------+

There are two results because you did NOT specify how to get the value of the a.stuname field in SQL, and two results are both satisfied by SQL semantics. It results in an unstable result set. Therefore, if you want to guarantee the stability of the result set of the GROUP BY statement, use the FULL GROUP BY syntax.

MySQL provides a sql_mode switch ONLY_FULL_GROUP_BY to control whether to check the FULL GROUP BY syntax or not. TiDB is also compatible with this sql_mode switch.

mysql> select a.class, a.stuname, max(b.courscore) from stu_info a join stu_score b on a.stuno=b.stuno group by a.class order by a.class, a.stuname;
+------------+--------------+------------------+
| class      | stuname      | max(b.courscore) |
+------------+--------------+------------------+
| 2018_CS_01 | MonkeyDLuffy |             95.5 |
| 2018_CS_03 | PatrickStar  |             99.0 |
+------------+--------------+------------------+
2 rows in set (0.01 sec)

mysql> set @@sql_mode='STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.01 sec)

mysql> select a.class, a.stuname, max(b.courscore) from stu_info a join stu_score b on a.stuno=b.stuno group by a.class order by a.class, a.stuname;
ERROR 1055 (42000): Expression #2 of ORDER BY is not in GROUP BY clause and contains nonaggregated column '' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

Run results: The above example shows the effect when you set ONLY_FULL_GROUP_BY for sql_mode.

ORDER BY

In SQL semantics, the result set is output in order only if the ORDER BY syntax is used. For a single-instance database, since the data is stored on one server, the results of multiple executions are often stable without data reorganization. Some databases (especially the MySQL InnoDB storage engine) can even output the result sets in order of the primary key or index.

As a distributed database, TiDB stores data on multiple servers. In addition, the TiDB layer does not cache data pages, so the result set order of SQL statements without ORDER BY is easily perceived as unstable. To output a sequential result set, you need to explicitly add the order field into the ORDER BY clause, which conforms to SQL semantics.

In the following example, only one field is added to the ORDER BY clause, and TiDB only sorts the results by that one field.

mysql> select a.class, a.stuname, b.course, b.courscore from stu_info a join stu_score b on a.stuno=b.stuno order by a.class;
+------------+--------------+-------------------------+-----------+
| class      | stuname      | course                  | courscore |
+------------+--------------+-------------------------+-----------+
| 2018_CS_01 | MonkeyDLuffy | PrinciplesofDatabase    |      60.5 |
| 2018_CS_01 | MonkeyDLuffy | English                 |      43.0 |
| 2018_CS_01 | MonkeyDLuffy | OpSwimming              |      67.0 |
| 2018_CS_01 | MonkeyDLuffy | OpFencing               |      76.0 |
| 2018_CS_01 | MonkeyDLuffy | FundamentalsofCompiling |      88.0 |
| 2018_CS_01 | MonkeyDLuffy | OperatingSystem         |      90.5 |
| 2018_CS_01 | MonkeyDLuffy | PrincipleofStatistics   |      69.0 |
| 2018_CS_01 | MonkeyDLuffy | ProbabilityTheory       |      76.0 |
| 2018_CS_01 | MonkeyDLuffy | Physics                 |      63.5 |
| 2018_CS_01 | MonkeyDLuffy | AdvancedMathematics     |      95.5 |
| 2018_CS_01 | MonkeyDLuffy | LinearAlgebra           |      92.5 |
| 2018_CS_01 | MonkeyDLuffy | DiscreteMathematics     |      89.0 |
| 2018_CS_03 | SpongeBob    | PrinciplesofDatabase    |      88.0 |
| 2018_CS_03 | SpongeBob    | English                 |      79.0 |
| 2018_CS_03 | SpongeBob    | OpBasketball            |      92.0 |
| 2018_CS_03 | SpongeBob    | OpTennis                |      94.0 |
| 2018_CS_03 | PatrickStar  | LinearAlgebra           |       6.5 |
| 2018_CS_03 | PatrickStar  | AdvancedMathematics     |       5.0 |
| 2018_CS_03 | SpongeBob    | DiscreteMathematics     |      72.0 |
| 2018_CS_03 | PatrickStar  | ProbabilityTheory       |      12.0 |
| 2018_CS_03 | PatrickStar  | PrincipleofStatistics   |      20.0 |
| 2018_CS_03 | PatrickStar  | OperatingSystem         |      36.0 |
| 2018_CS_03 | PatrickStar  | FundamentalsofCompiling |       2.0 |
| 2018_CS_03 | PatrickStar  | DiscreteMathematics     |      14.0 |
| 2018_CS_03 | PatrickStar  | PrinciplesofDatabase    |       9.0 |
| 2018_CS_03 | PatrickStar  | English                 |      60.0 |
| 2018_CS_03 | PatrickStar  | OpTableTennis           |      12.0 |
| 2018_CS_03 | PatrickStar  | OpPiano                 |      99.0 |
| 2018_CS_03 | SpongeBob    | FundamentalsofCompiling |      43.0 |
| 2018_CS_03 | SpongeBob    | OperatingSystem         |      95.0 |
| 2018_CS_03 | SpongeBob    | PrincipleofStatistics   |      90.0 |
| 2018_CS_03 | SpongeBob    | ProbabilityTheory       |      87.0 |
| 2018_CS_03 | SpongeBob    | Physics                 |      65.0 |
| 2018_CS_03 | SpongeBob    | AdvancedMathematics     |      55.0 |
| 2018_CS_03 | SpongeBob    | LinearAlgebra           |      60.5 |
| 2018_CS_03 | PatrickStar  | Physics                 |       6.0 |
+------------+--------------+-------------------------+-----------+
36 rows in set (0.01 sec)

Results are unstable when the ORDER BY values are the same. To reduce randomness, ORDER BY values should be unique. If you can't guarantee the uniqueness, you need to add more ORDER BY fields until the combination of the ORDER BY fields in ORDER BY is unique, then the result will be stable.

The result set is unstable because order by is not used in `GROUP_CONCAT()`

The result set is unstable because TiDB reads data from the storage layer in parallel, so the result set order returned by GROUP_CONCAT() without ORDER BY is easily perceived as unstable.

To let GROUP_CONCAT() get the result set output in order, you need to add the sorting fields to the ORDER BY clause, which conforms to the SQL semantics. In the following example, GROUP_CONCAT() that splices customer_id without ORDER BY causes an unstable result set.

Excluded ORDER BY

First query:

mysql>  select GROUP_CONCAT( customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%';
+-------------------------------------------------------------------------+
| GROUP_CONCAT(customer_id  SEPARATOR ',')                                |
+-------------------------------------------------------------------------+
| 20000200992,20000200993,20000200994,20000200995,20000200996,20000200... |
+-------------------------------------------------------------------------+

Second query:

mysql>  select GROUP_CONCAT( customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%';
+-------------------------------------------------------------------------+
| GROUP_CONCAT(customer_id  SEPARATOR ',')                                |
+-------------------------------------------------------------------------+
| 20000203040,20000203041,20000203042,20000203043,20000203044,20000203... |
+-------------------------------------------------------------------------+

Include ORDER BY

First query:

mysql>  select GROUP_CONCAT( customer_id order by customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%';
+-------------------------------------------------------------------------+
| GROUP_CONCAT(customer_id  SEPARATOR ',')                                |
+-------------------------------------------------------------------------+
| 20000200000,20000200001,20000200002,20000200003,20000200004,20000200... |
+-------------------------------------------------------------------------+

Second query:

mysql>  select GROUP_CONCAT( customer_id order by customer_id SEPARATOR ',' ) FROM customer where customer_id like '200002%';
+-------------------------------------------------------------------------+
| GROUP_CONCAT(customer_id  SEPARATOR ',')                                |
+-------------------------------------------------------------------------+
| 20000200000,20000200001,20000200002,20000200003,20000200004,20000200... |
+-------------------------------------------------------------------------+

Unstable results in `SELECT * FROM T LIMIT N`

The returned result is related to the distribution of data on the storage node (TiKV). If multiple queries are performed, different storage units (Regions) of the storage nodes (TiKV) return results at different speeds, which can cause unstable results.

Download PDF Request docs changes

What’s on this page

GROUP BY
ORDER BY
The result set is unstable because order by is not used in GROUP_CONCAT()
Unstable results in SELECT * FROM T LIMIT N

Was this page helpful?

Unstable Result Set

GROUP BY

ORDER BY

The result set is unstable because order by is not used in GROUP_CONCAT()

Unstable results in SELECT * FROM T LIMIT N

The result set is unstable because order by is not used in `GROUP_CONCAT()`

Unstable results in `SELECT * FROM T LIMIT N`