How does MySQL order rows with the same value?

In my database I have some records where I am sorting by a column that contains identical values:
| col1 | timestamp |
| row1 | 2011-07-01 00:00:00 |
| row2 | 2011-07-01 00:00:00 |
| row3 | 2011-07-01 00:00:00 |
SELECT ... ORDER BY timestamp
It looks like the result is in random order.
Is the random order consistent? I have this data on two MySQL servers; can I expect the same result from both?

I'd advise against making that assumption. In standard SQL, anything not required by an explicit ORDER BY clause is implementation dependent.
I can't speak for MySQL, but on e.g. SQL Server, the output order for rows that are "equal" so far as the ORDER BY is concerned may vary every time the query is run - and could be influenced by practically anything (e.g. patch/service pack level of the server, workload, which pages are currently in the buffer pool, etc).
So if you need a specific order, the best thing you can do (both to guarantee it, and to document your query for future maintainers) is explicitly request the ordering you want.

Lots of answers already, but the bottom line answer is NO.
If you want rows returned in a particular sequence, consistently, then specify that in an ORDER BY. Without that, there is absolutely NO GUARANTEE what order rows will be returned in.
I think what you may be missing is that there can be multiple expressions listed in the ORDER BY clause. And you can include expressions that are not in the SELECT list.
In your case, for example, you could use ORDER BY timestamp, id.
(Or some other columns or expressions.)
That will order the rows first on timestamp, and then any rows that have the same value for timestamp will be ordered by id, or whatever the next expression in this list is.
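A minimal sketch of the tie-breaker idea, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable; the table name events and the extra row4 are hypothetical). The rows are inserted out of id order on purpose, and the second ORDER BY expression decides the order among the rows that share a timestamp:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, col1 TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO events (id, col1, timestamp) VALUES (?, ?, ?)",
    [
        (3, "row3", "2011-07-01 00:00:00"),
        (1, "row1", "2011-07-01 00:00:00"),
        (2, "row2", "2011-07-01 00:00:00"),
        (4, "row4", "2011-06-30 12:00:00"),
    ],
)


def ordered_rows(connection):
    """Sort by timestamp first, then break ties with the unique id column."""
    cursor = connection.execute(
        "SELECT col1 FROM events ORDER BY timestamp, id"
    )
    return [row[0] for row in cursor]


result = ordered_rows(conn)
print(result)
```

Because id is unique, no two rows compare as equal under this ORDER BY, so the result no longer depends on how the engine happens to read the rows.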

The answer is: No, the order won't be consistent. I faced the same issue and solved it by adding another column to the ORDER BY clause. Be sure that this column is unique for each record, like 'ID' or whatever it is.
If your table has no such column, add an 'ID' field that is unique for each record. You can make it 'AI' (auto increment) so that you don't have to maintain it yourself.
After adding the 'ID' column, update the last part of your query like this:
SELECT mt.*
FROM my_table mt
ORDER BY mt.timestamp ASC, mt.id DESC

When several rows have the same value in the ORDER BY column, their relative order is not fixed. Example: you want to ORDER BY a column that holds word frequencies, and two different words in the table have the same value in the frequency column. With "select * from table_name ORDER BY frequency", either of the two words may show up just before the other. On a second run, the word that came second may now come first. It depends on buffer memory, which pages are being paged in and out at the moment, etc.

That depends on the storage engine used. In MyISAM they'll be ordered in natural order (i.e. in the order they're stored on disk, which can be changed using the ALTER TABLE ... ORDER BY command). In InnoDB they'll be ordered by PK. Other engines can have their own rules.

Related

MySQL select query giving wrong sequences of rows from primary key

I am running this MySQL select query:
select * from ABC where column_value=1;
I expect to get output like this:
ID Name
1 AAA
2 BBB
3 CCC
But instead I am getting this:
ID Name
2 BBB
1 AAA
3 CCC
Can anyone give me an idea why MySQL is behaving like this?
Databases tend to use the fastest way to read data from tables. This means it may return data in any order if it finds it faster, unless you use an ORDER BY clause.
select * from ABC where column_value=1;
The query doesn't specify any sorting of the returned rows.
SQL is a language that handles sets of tuples, and a set is, by definition, an unordered collection of items.
The fact that, under some circumstances, one database engine or another returns the rows in a certain order (sorted by the value of the PK, for example) is an implementation detail. It is not required by the language and it can change at any time.
Even more, when the query doesn't specify an order for the returned rows, the database engine uses whatever method it finds more appropriate to get them fast. The order may depend on external factors and it may change over time. For example, if you remove from the table the rows returned by the query then insert them again but in a different order, a subsequent run of the same query may (and it most probably does) return the rows in a different order than before.
As an insight (that is neither exact, nor reliable), for a query that doesn't contain an ORDER BY clause over a small table, the database returns the rows in the order it finds them in the table data because it doesn't read the index.
For small tables the engine skips reading the index when it is not needed and goes directly to the table data. This way it spares a disk access that doesn't provide any additional value to the processing.
By default the id, i.e. the default primary key, will often determine the order. Is it possible that you have deleted some rows?
select * from ABC where column_value=1 order by ID;
You can use ORDER BY to obtain the desired result.

Is there any way to fetch the last N rows from a MySQL table without using auto-increment field or timestamp?

There are many solutions on Stack Overflow itself where the objective was to read the last n rows of a table using either an auto-increment field or a timestamp. For example, the following query fetches the last ten records from a table named tab, in descending order of the values of id, which is an auto-increment field in the table:
Select * from tab order by id desc limit 10
My question is: Is there any alternative way without having to get an auto increment field or timestamp to accomplish the task to get the same output?
Note: The motivation for this question is that when we store records in a table and then query the database with a simple query that specifies no criteria, like:
Select * from tab
Then the order of the output is same as the order of the records as inserted into the table. So is there any way to get the records in the reverse order of what they were entered into the database?
Data in MySQL is not ordered: you have no guarantee about the order of the records you'll get unless you specify ORDER BY in your query.
So no, unless you order by timestamp, id, or some other field, you can't get the last rows, simply because there is no 'last' without an order.
In the SQL world, order is not an inherent property of a set of data.
Thus, you get no guarantees from your RDBMS that your data will come
back in a certain order -- or even in a consistent order -- unless you
query your data with an ORDER BY clause.
So if you don't have the data sorted by some id or some other column, you cannot track the data based on its sorting. How MySQL stores the data is not guaranteed, and hence you cannot get the last n records.
You can also check this article:
Caveats
Ordering of Rows
In the absence of ORDER BY, records may be returned in a different
order than the previous MEMORY implementation.
This is not a bug. Any application relying on a specific order without
an ORDER BY clause may deliver unexpected results. A specific order
without ORDER BY is a side effect of a storage engine and query
optimizer implementation which may and will change between minor MySQL
releases.

Questionable SQL practice - Order By id rather than creation time

So I have an interesting question that I am not sure is considered a 'hack' or not. I looked through some questions but did not find a duplicate so here it is. Basically, I need to know if this is unreliable or considered bad practice.
I have a very simple table with a unique auto incrementing id and a created_at timestamp.
(simplified version of my problem to clarify the concept in question)
+----+---------------------+
| id | created_at          |
+----+---------------------+
|  1 | 2012-12-11 20:35:19 |
|  2 | 2012-12-12 20:35:19 |
|  3 | 2012-12-13 20:35:19 |
|  4 | 2012-12-14 20:35:19 |
+----+---------------------+
Both of these columns are added dynamically so it can be said that a new 'insert' will ALWAYS have a greater id and ALWAYS have a greater date.
OBJECTIVE -
very simply grab the results ordered by created_at in descending order
SOLUTION ONE - A query that orders by date in descending order
SELECT * FROM tablename
ORDER BY created_at DESC
SOLUTION TWO - A query that orders by ID in descending order
SELECT * FROM tablename
ORDER BY id DESC
Is solution two considered bad practice? Or is solution two the proper way of doing things. Any explanation of your reasonings would be very helpful as I am trying to understand the concept, not just simply get an answer. Thanks in advance.
In typical practice you can almost always assume that an autoincrement id can be sorted to give you the records in creation order (either direction). However, you should note that this is not considered portable in terms of your data. You might move your data to another system where the keys are recreated, but the created_at data is the same.
There is actually a pretty good StackOverflow discussion of this issue.
The basic summary is the first solution, ordering by created_at, is considered best practice. Be sure, however, to properly index the created_at field to give the best performance.
You shouldn't rely on ID for anything other than that it uniquely identifies a row. It's an arbitrary number that only happens to correspond to the order in which the records were created.
Say you have this table
ID creation_date
1 2010-10-25
2 2010-10-26
3 2012-03-05
In this case, sorting on ID instead of creation_date works.
Now in the future you realize, oh, whoops, you have to change the creation date of record ID #2 to 2010-09-17. Your sorts using ID still report the records in the same order:
1 2010-10-25
2 2010-09-17
3 2012-03-05
even though with the new date they should be:
2 2010-09-17
1 2010-10-25
3 2012-03-05
Short version: Use data columns for the purpose that they were created. Don't rely on side effects of the data.
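The scenario above can be reproduced in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL, and the table name records and the helper ids_ordered_by are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, creation_date TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "2010-10-25"), (2, "2010-10-26"), (3, "2012-03-05")],
)

# The correction from the example: record 2 gets an earlier creation date.
conn.execute("UPDATE records SET creation_date = '2010-09-17' WHERE id = 2")


def ids_ordered_by(connection, column):
    """Return the ids sorted by the given column, with id as a tie-breaker."""
    # The column name is checked against a fixed whitelist because
    # identifiers cannot be passed as bound parameters.
    if column not in ("id", "creation_date"):
        raise ValueError(f"unsupported sort column: {column}")
    query = f"SELECT id FROM records ORDER BY {column}, id"
    return [row[0] for row in connection.execute(query)]


by_id = ids_ordered_by(conn, "id")
by_date = ids_ordered_by(conn, "creation_date")
print(by_id, by_date)
```

After the update the two sorts disagree, which is the point of the answer: the id order only matched the date order by coincidence.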
There are a couple of differences between the two options.
The first is that they can give different results.
The value of created_at might be affected by the time being adjusted on the server but the id column will be unaffected. If the time is adjusted backwards (either manually or automatically by time synchronization software) you could get records that were inserted later but with timestamps that are before records that were inserted earlier. In this case you will get a different order depending on which column you order by. Which order you consider to be "correct" is up to you.
The second is performance. It is likely to be faster to ORDER BY your clustered index.
How the Clustered Index Speeds Up Queries
Accessing a row through the clustered index is fast because the row data is on the same page where the index search leads.
By default the clustered key is the primary key, which in your case is presumably the id column. You will probably find that ORDER BY id is slightly faster than ORDER BY created_at.
Primary keys, especially of surrogate type, do not usually represent any kind of meaningful data aside from the fact that their mere function is to allow for uniquely identifiable records. Since dates in this case do represent meaningful data that has meaning outside of its primary function I'd say sorting according to dates is a more logical approach here.
Ordering by id orders by insertion order.
If you have use cases where insertion can be delayed, for example a batch process, then you must order by created_at to order by time.
Both are acceptable if they meet your needs.

What is the best way to sort by columns in mysql and use index?

I have a table with 10 columns, Now I want to give the users an option to sort the data with any column they want. For example suppose a combo box with 7 items that each of them is a column of the table, now the user choose one item and get the data sorted by the chosen column.
Now what is the problem?
My table has 3M records, and if I sort the data with indexed column I have no problem but with a non index column it takes 3.5mins to sort!!!
What is the solution I am thinking about?
Add index to every column of table that is needed to be sort by! In my case I will have index on 8 columns!!!!
What is the problem of my solution?
Having a lot of index on columns may decrease the speed of INSERT/UPDATE queries! In my case the table is updated frequently (every second!!!!!)
What is your solution for this case?!
Read this for more details on optimization: http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. Using an index for sorting often comes together with using an index to find rows; however, it can also be used just for sorting, for example if you’re just using ORDER BY without any WHERE clauses on the table. In such a case you would see the “index” type in EXPLAIN, which corresponds to scanning (potentially) the complete table in index order. It is very important to understand under which conditions an index can be used to sort data together with restricting the number of rows.
Looking at the same index (A,B) things like ORDER BY A ; ORDER BY A,B ; ORDER BY A DESC, B DESC will be able to use full index for sorting (note MySQL may not select to use index for sort if you sort full table without a limit). However ORDER BY B or ORDER BY A, B DESC will not be able to use index because requested order does not line up with the order of data in BTREE. If you have both restriction and sorting things like this would work A=5 ORDER BY B ; A=5 ORDER BY B DESC; A>5 ORDER BY A ; A>5 ORDER BY A,B ; A>5 ORDER BY A DESC which again can be easily visualized as scanning a range in BTREE. Things like this however would not work A>5 ORDER BY B , A>5 ORDER BY A,B DESC or A IN (3,4) ORDER BY B – in these cases getting data in sorting form would require a bit more than simple range scan in the BTREE and MySQL decides to pass it on.
Option #1: If you are limited to MySQL there's no better option than to create 8 indexes for the possible order columns. Your inserts/updates are going to suffer for sure, but no real visitor will wait 3.5 minutes for a list to be sorted.
Tune #1: To make it a little faster you can create partial (prefix) indexes instead of standard indexes, which will use much less space (I assume some of these columns are varchar), and this means fewer writes and a smaller footprint in memory. You just need to check the entropy of each column for a given substring length and make sure you still keep over 90% distinct values.
For example with a query like this:
> select count(distinct(substring(COLUMN, 1, 5))) as part_5, count(distinct(substring(COLUMN, 1, 10))) as part_10, count(distinct(substring(COLUMN, 1, 20))) as part_20, count(distinct(COLUMN)) as sum from TABLE;
+--------+---------+---------+---------+
| part_5 | part_10 | part_20 | sum |
+--------+---------+---------+---------+
| 892183 | 1996053 | 1996058 | 1996058 |
+--------+---------+---------+---------+
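The same check can be scripted. The sketch below is an illustration only: it runs against Python's built-in sqlite3 module instead of MySQL (so it uses substr rather than substring), and the table items, the column name and the helper prefix_selectivity are hypothetical. It returns the share of distinct values that remain distinguishable when only the first few characters are kept:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("alpha-one",), ("alpha-two",), ("beta-one",), ("beta-two",), ("gamma",)],
)


def prefix_selectivity(connection, length):
    """Distinct prefixes of the given length divided by distinct full values."""
    distinct_prefixes, distinct_values = connection.execute(
        "SELECT COUNT(DISTINCT substr(name, 1, ?)), COUNT(DISTINCT name) "
        "FROM items",
        (length,),
    ).fetchone()
    return distinct_prefixes / distinct_values


for length in (5, 9):
    print(length, prefix_selectivity(conn, length))
```

A prefix length would be acceptable under the 90% rule of thumb above once this ratio stays above 0.9 on the real data.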
Tune #2: You can make your insert/update statements execute in the background. The application won't be faster but the user experience is going to be much better.
Tune #3: Use bigger transactions if you can for the inserts/updates.
Option #2: You can try one of the search engines that have been built for this usage pattern (too). I would recommend Solr, as I have been using it for a while with great satisfaction, but I have heard good things about Elasticsearch as well.

avoid Sorting by the MYSQL IN Keyword

When querying the db for a set of ids, MySQL does not return the results in the order in which the ids were specified. The query I am using is the following:
SELECT id ,title, date FROM Table WHERE id in (7,1,5,9,3)
in return the result provided is in the order 1,3,5,7,9.
How can I avoid this auto sorting?
If you want to order your result by id in the order specified in the in clause you can make use of FIND_IN_SET as:
SELECT id ,title, date
FROM Table
WHERE id in (7,1,5,9,3)
ORDER BY FIND_IN_SET(id,'7,1,5,9,3')
There is no auto-sorting or default sorting going on. The sorting you're seeing is most likely the natural sorting of rows within the table, i.e. the order they were inserted. If you want the results sorted in some other way, specify it using an ORDER BY clause. The IN clause itself cannot dictate a sort order; the ordering of its items has no effect on the result.
The WHERE clause in SQL does not affect the sort order; the ORDER BY clause does that.
If you don't specify a sort order using ORDER BY, SQL will pick its own order, which will typically be the order of the primary key, but could be anything.
If you want the records in a particular order, you need to specify an ORDER BY clause that tells SQL the order you want.
If the order you want is based solely on that odd sequence of IDs, then you'd need to specify that in the ORDER BY clause. It will be tricky to specify exactly that. It is possible, but will need some awkward SQL code, and will slow down the query significantly (due to it no longer using a key to find the records).
If your desired ID sequence is because of some other factor that is more predictable (say for example, you actually want the records in alphabetical name order), you can just do ORDER BY name (or whatever the field is).
If you really want to sort by the ID in an arbitrary sequence, you may need to generate a temporary field which you can use to sort by:
SELECT *,
CASE id
WHEN 7 THEN 1
WHEN 1 THEN 2
WHEN 5 THEN 3
WHEN 9 THEN 4
WHEN 3 THEN 5
END AS mysortorder
FROM mytable
WHERE id in (7,1,5,9,3)
ORDER BY mysortorder;
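When the id list comes from application code, the CASE expression can be generated rather than typed by hand. The following is a rough sketch under stated assumptions: it uses Python's built-in sqlite3 module in place of MySQL, and the helper fetch_in_given_order and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(i, f"title {i}") for i in range(1, 10)],
)


def fetch_in_given_order(connection, ids):
    """Return the matching ids in the same order as the ids argument."""
    if not ids:
        return []
    # One placeholder per id for the IN list.
    placeholders = ", ".join("?" for _ in ids)
    # One WHEN branch per id; the position is an integer generated here,
    # while the id itself is passed as a bound parameter.
    whens = " ".join(
        f"WHEN ? THEN {position}" for position, _ in enumerate(ids, start=1)
    )
    query = (
        f"SELECT id FROM mytable WHERE id IN ({placeholders}) "
        f"ORDER BY CASE id {whens} END"
    )
    # The ids are bound twice: first for IN, then for the CASE branches.
    params = list(ids) + list(ids)
    return [row[0] for row in connection.execute(query, params)]


wanted = [7, 1, 5, 9, 3]
print(fetch_in_given_order(conn, wanted))
```

For short lists, sorting the fetched rows in application code by their position in the list is a simpler alternative with the same effect.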
The behaviour you are seeing is a result of query optimisation. I expect that you have an index on id, so the IN clause will use the index to return records in the most efficient way. As no ORDER BY clause has been specified, the database will assume that the order of the returned records is not important and will optimise for speed. (Check out "EXPLAIN SELECT".)
CodeAddicts or Spudley's answer will give the result you want. An alternative is assigning a priority to the id's in "mytable" (or another table) and using this to order the records as desired.