PostgreSQL equivalent for MySQL GROUP BY


I need to find duplicates in a table. In MySQL I simply write:
SELECT *,count(id) count FROM `MY_TABLE`
GROUP BY SOME_COLUMN ORDER BY count DESC
This query nicely:
Finds duplicates based on SOME_COLUMN, giving its repetition count.
Sorts in desc order of repetition, which is useful to quickly scan major dups.
Chooses a random value for all remaining columns, giving me an idea of values in those columns.
Similar query in Postgres greets me with an error:
column "MY_TABLE.SOME_COLUMN" must appear in the GROUP BY clause or be
used in an aggregate function
What is the Postgres equivalent of this query?
PS: I know that MySQL behaviour deviates from SQL standards.

Back-ticks are a non-standard MySQL thing. Use the canonical double quotes to quote identifiers (possible in MySQL, too). That is, if your table in fact is named "MY_TABLE" (all upper case). If you (more wisely) named it my_table (all lower case), then you can remove the double quotes or use lower case.
Also, I use ct instead of count as alias, because it is bad practice to use function names as identifiers.
Simple case
This would work with PostgreSQL 9.1:
SELECT *, count(id) ct
FROM my_table
GROUP BY primary_key_column(s)
ORDER BY ct DESC;
It requires primary key column(s) in the GROUP BY clause. The results are identical to a MySQL query, but ct would always be 1 (or 0 if id IS NULL) - useless to find duplicates.
Group by other than primary key columns
If you want to group by other column(s), things get more complicated. This query mimics the behavior of your MySQL query - and you can use *.
SELECT DISTINCT ON (1, some_column)
count(*) OVER (PARTITION BY some_column) AS ct
,*
FROM my_table
ORDER BY 1 DESC, some_column, id, col1;
This works because DISTINCT ON (PostgreSQL specific), like DISTINCT (SQL standard), is applied after the window function count(*) OVER (...). Window functions (with the OVER clause) require PostgreSQL 8.4 or later; MySQL only supports them from version 8.0 onward.
Works with any table, regardless of primary or unique constraints.
The 1 in DISTINCT ON and ORDER BY is just shorthand to refer to the ordinal number of the item in the SELECT list.
SQL Fiddle to demonstrate both side by side.
More details in this closely related answer:
Select first row in each GROUP BY group?
count(*) vs. count(id)
If you are looking for duplicates, you are better off with count(*) than with count(id). There is a subtle difference if id can be NULL, because NULL values are not counted - while count(*) counts all rows. If id is defined NOT NULL, results are the same, but count(*) is generally more appropriate (and slightly faster, too).

Here's another approach, uses DISTINCT ON:
select
distinct on(ct, some_column)
*,
count(id) over(PARTITION BY some_column) as ct
from my_table x
order by ct desc, some_column, id
Data source:
CREATE TABLE my_table (some_column int, id int, col1 int);
INSERT INTO my_table VALUES
(1, 3, 4)
,(2, 4, 1)
,(2, 5, 1)
,(3, 6, 4)
,(3, 7, 3)
,(4, 8, 3)
,(4, 9, 4)
,(5, 10, 1)
,(5, 11, 2)
,(5, 11, 3);
Output:
SOME_COLUMN ID COL1 CT
5 10 1 3
2 4 1 2
3 6 4 2
4 8 3 2
1 3 4 1
Live test: http://www.sqlfiddle.com/#!1/e2509/1
DISTINCT ON documentation: http://www.postgresonline.com/journal/archives/4-Using-Distinct-ON-to-return-newest-order-for-each-customer.html
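For engines that have neither DISTINCT ON nor window functions, a similar result can be sketched with a grouped subquery joined back to the table. The snippet below is only an illustration: it runs the answer's sample data through Python's bundled sqlite3 module, and it assumes that the smallest id inside each group is unique (otherwise more than one row per group comes back). The function name first_row_per_group is made up for this example.

```python
import sqlite3

# Sample rows from the answer above: (some_column, id, col1)
SAMPLE = [(1, 3, 4), (2, 4, 1), (2, 5, 1), (3, 6, 4), (3, 7, 3),
          (4, 8, 3), (4, 9, 4), (5, 10, 1), (5, 11, 2), (5, 11, 3)]


def first_row_per_group(rows):
    """Return one row per some_column plus its repetition count, most repeated first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (some_column int, id int, col1 int)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT t.some_column, t.id, t.col1, g.ct
        FROM my_table AS t
        JOIN (SELECT some_column, count(*) AS ct, min(id) AS first_id
              FROM my_table
              GROUP BY some_column) AS g
          ON g.some_column = t.some_column AND g.first_id = t.id
        ORDER BY g.ct DESC, t.some_column
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in first_row_per_group(SAMPLE):
    print(row)
```

With the sample data this produces the same five rows as the output listed above, in the same order.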

MySQL allows GROUP BY to omit non-aggregated selected columns from the GROUP BY list; for those columns it returns the values of an arbitrary row from each group (in practice often the first row found). This is non-standard SQL behaviour.
Postgres, on the other hand, is SQL standard compliant.
There is no equivalent query in Postgres.

Here is a self-joined CTE, which allows you to use select *. key0 is the intended unique key, {key1,key2} are the additional key elements needed to address the currently non-unique rows. Use at your own risk, YMMV.
WITH zcte AS (
SELECT DISTINCT tt.key0
, MIN(tt.key1) AS key1
, MIN(tt.key2) AS key2
, COUNT(*) AS cnt
FROM ztable tt
GROUP BY tt.key0
HAVING COUNT(*) > 1
)
SELECT zt.*
, zc.cnt AS cnt
FROM ztable zt
JOIN zcte zc ON zc.key0 = zt.key0 AND zc.key1 = zt.key1 AND zc.key2 = zt.key2
ORDER BY zt.key0, zt.key1,zt.key2
;
BTW: to get the intended behaviour for the OP, the HAVING COUNT(*) > 1 clause should be omitted.

Related

Is it possible to query MySQL to get only fields that contain duplicate/repeating strings?

What I mean is, I have table with a "list" column. The data that goes into the "list" is related to addresses, so I sometimes get repeated zip codes for one record in that field.
For example, "12345,12345,12345,12456".
I want to know if it's possible to construct a query that would find the records that have an unknown string that duplicates within the field, such that I would get the records like "12345,12345,12345,12456", but not ones like "12345,45678,09876".
I hope that makes sense.

Yes, it is possible. You need to use a numbers table to convert your delimited string into rows, then use group by to find duplicates, e.g.
CREATE TABLE T (ID INT, List VARCHAR(100));
INSERT INTO T (ID, List)
VALUES (1, '12345,12345,12345,12456'), (2, '12345,45678,09876');
SELECT
T.ID,
SUBSTRING_INDEX(SUBSTRING_INDEX(T.list, ',', n.Number), ',', -1) AS ListItem
FROM T
INNER JOIN
( SELECT 1 AS Number UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) AS n
ON CHAR_LENGTH(T.list)-CHAR_LENGTH(REPLACE(T.list, ',', ''))>=n.Number-1
GROUP BY T.ID, ListItem
HAVING COUNT(*) > 1;
If you don't have a numbers table you can create one in a derived query as I have above with UNION ALL
Example on DB Fiddle
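To make the nested SUBSTRING_INDEX trick easier to follow, here is a rough Python emulation of the query above. The helper names (substring_index, list_items, duplicated_items) are invented for this sketch; substring_index only mimics the MySQL function for the cases used here, and the numbers 1 to 5 play the role of the numbers table.

```python
from collections import Counter


def substring_index(text, delim, count):
    """Mimic MySQL SUBSTRING_INDEX: keep `count` parts from the left (or right if negative)."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""


def list_items(text, numbers=range(1, 6)):
    """Turn a comma-separated string into items, one per matching number n."""
    comma_count = len(text) - len(text.replace(",", ""))
    # Same condition as the JOIN: n is valid while (number of commas) >= n - 1.
    return [substring_index(substring_index(text, ",", n), ",", -1)
            for n in numbers if comma_count >= n - 1]


def duplicated_items(text):
    """Equivalent of GROUP BY ListItem HAVING COUNT(*) > 1 for a single row."""
    counts = Counter(list_items(text))
    return sorted(item for item, n in counts.items() if n > 1)


print(duplicated_items("12345,12345,12345,12456"))
print(duplicated_items("12345,45678,09876"))
```

The inner call keeps the first n items, and the outer call with -1 keeps the last of those, which is the nth item. Lists longer than the numbers table (five entries here) would be cut off, exactly as in the SQL version.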
With that being said, this is almost certainly not the right way to store your data; you should instead use a child table, e.g.
CREATE TABLE ListItems
(
MainTableId INT NOT NULL, --Foreign Key to your current table
ItemName VARCHAR(10) NOT NULL -- Or whatever data type you need
);
Then your query is much simpler:
SELECT T.ID, li.ItemName
FROM T
INNER JOIN ListItems AS li
ON li.MainTableId = T.ID
GROUP BY T.ID, li.ItemName
HAVING COUNT(*) > 1;
If you need to recreate your original format, this is easily done with GROUP_CONCAT():
SELECT T.ID,
GROUP_CONCAT(li.ItemName) AS List
FROM T
INNER JOIN ListItems AS li
ON li.MainTableId = T.ID
GROUP BY T.ID;
Example on DB Fiddle

I am still unclear what your desired result is based on your question. However, if it is simply to get all rows where there is a duplicate entry in column list, you could do the following:
SELECT * FROM TABLE
WHERE COLUMN IN
(SELECT COLUMN FROM TABLE
GROUP BY COLUMN
HAVING count(*) > 1)

Multiple select in select

I have a code like this:
SELECT column1 = (SELECT MAX(column-name21) FROM table-name2 WHERE condition2 GROUP BY id2) as m,
column2 = (SELECT count(*) FROM table-name2 WHERE condition2 GROUP BY id2) as c,
column-names
FROM table-name
WHERE condition
ORDER BY ordercondition
LIMIT 25,50
those internal selects are quite long and complicated.
My question is: are there language constructs in MySQL which allow one to avoid duplicating code and computations in this case?
For example, something like this
SELECT (column1, column2) = (SELECT MAX(column-name1) as m, count(*) as c FROM table-name WHERE condition GROUP BY id),
column-names
FROM table-name
WHERE condition
ORDER BY ordercondition
LIMIT 25,50
which of course won't be interpreted by mysql.
I tried this:
SELECT (SELECT MAX(column-name1) as column1, count(*) as column2 FROM table-name WHERE condition GROUP BY id),
column-names
FROM table-name
WHERE condition
ORDER BY ordercondition
LIMIT 25,50
and it also doesn't work.

Such subqueries get cumbersome when you need more than one from the same source. Usually, the "fix" is to use a "derived table" and JOIN:
SELECT x2.col1, x2.col2, names
FROM ( SELECT MAX(c21) AS col1,
COUNT(*) AS col2,
?? -- may be needed for "cond2"
FROM t2
WHERE cond2a ) AS x2
JOIN t1
ON cond2b
WHERE cond1
ORDER BY ??? -- Limit is non-deterministic without ORDER BY
LIMIT 25, 50
If the "condition" in the subquery is "correlated", please specify it; it makes a big difference in how to transform the query.
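As a concrete, runnable illustration of the derived-table pattern, here is a small sketch using Python's sqlite3 module. The table contents, the name column and the join on id2 are assumptions made up for the example (the question does not show the real schema), and the LIMIT is shrunk to fit the tiny data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t2 (id2 INTEGER, c21 INTEGER);
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO t2 VALUES (1, 10), (1, 30), (2, 5), (3, 7), (3, 9), (3, 8);
""")

# MAX and COUNT are computed once per id2 in the derived table x2,
# then joined to t1 instead of repeating two correlated subqueries.
rows = conn.execute("""
    SELECT t1.name, x2.col1, x2.col2
    FROM (SELECT id2, MAX(c21) AS col1, COUNT(*) AS col2
          FROM t2
          GROUP BY id2) AS x2
    JOIN t1 ON t1.id = x2.id2
    ORDER BY t1.id
    LIMIT 2 OFFSET 1
""").fetchall()

print(rows)
```

Whether this beats the correlated subqueries depends on the real conditions and indexes, which is why the actual query matters.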
The construct COUNT(col) is usually a mistake:
COUNT(*) -- the number of rows.
COUNT(DISTINCT col) -- the number of different values in column `col`.
COUNT(col) -- count the number of rows with non-NULL `col`.
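The difference between the three forms is easy to check on a column that contains a duplicate and a NULL. This is a minimal sketch with Python's sqlite3 module; the table t and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col INTEGER)")
# Four rows: the value 1 twice, the value 2 once, and one NULL.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (2,), (None,)])

counts = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT col), COUNT(col) FROM t"
).fetchone()

print(counts)  # (rows, distinct non-NULL values, non-NULL values)
```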
Please provide your actual query and provide SHOW CREATE TABLE. I sloughed over several issues; "the devil is in the details".
for Edit 1
INDEX(tool, uuuuId) -- would help performance
Is uuuuId some form of "hash" or "UUID"? If so, that is relevant to seeing how the performance works. Also, how big (approximately) are the tables? What is the value of innodb_buffer_pool_size. (I am fishing for whether you are I/O-bound or CPU-bound.)
WZ needs INDEX(uuuuId, ppppppId, check1). But actually, that Select...=Yes can be turned into an EXISTS for some speedup.
Z might benefit from INDEX(check1, uuuuId, ppppppId, check2)
Since Z and WZ are the same table, this might take care of both:
INDEX(ppppppId, uuuuId, check1, check2)
(The order is important.)

How to use FIND_IN_SET using list of data

I have used FIND_IN_SET multiple times before but this case is a bit different.
Earlier I was searching a single value in the table like
SELECT * FROM tbl_name where find_in_set('1212121212', sku)
But now I have the list of SKUs which I want to search in the table. E.g
'3698520147','088586004490','868332000057','081308003405','088394000028','089541300893','0732511000148','009191711092','752830528161'
I have two columns in the table: SKU (e.g. 081308003405) and SKU Variation.
In the SKU column I am saving a single value, but in the variation column I am saving the values in comma-separated format, e.g. 081308003405,088394000028,089541300893
SELECT * FROM tbl_name
WHERE 1
AND upc IN ('3698520147','088586004490','868332000057','081308003405','088394000028',
'089541300893','0732511000148','009191711092','752830528161')
I am using the IN function to search the UPC value; now I want to search the variation column as well. My concern is how to search the variation column using the SKU list.
For now, I have to check the UPC variation in a loop, which takes too much time. Below is the query:
SELECT id FROM products
WHERE 1 AND upcVariation AND FIND_IN_SET('88076164444',upc_variation) > 0

First of all, consider storing the data in a normalized way. Here is a good read: Is storing a delimited list in a database column really that bad?
Now, assuming the following schema and data:
create table products (
id int auto_increment,
upc varchar(50),
upc_variation text,
primary key (id),
index (upc)
);
insert into products (upc, upc_variation) values
('01234', '01234,12345,23456'),
('56789', '45678,34567'),
('056789', '045678,034567');
We want to find products with variations '12345' and '34567'. The expected result is the 1st and the 2nd rows.
Normalized schema - many-to-many relation
Instead of storing the values in a comma separated list, create a new table, which maps product IDs with variations:
create table products_upc_variations (
product_id int,
upc_variation varchar(50),
primary key (product_id, upc_variation),
index (upc_variation, product_id)
);
insert into products_upc_variations (product_id, upc_variation) values
(1, '01234'),
(1, '12345'),
(1, '23456'),
(2, '45678'),
(2, '34567'),
(3, '045678'),
(3, '034567');
The select query would be:
select distinct p.*
from products p
join products_upc_variations v on v.product_id = p.id
where v.upc_variation in ('12345', '34567');
As you see - With a normalized schema the problem can be solved with a quite basic query. And we can effectively use indices.
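The normalized variant can be tried end to end with Python's sqlite3 module. This is just a sketch: the column types are simplified for SQLite, and the list of wanted variations is bound through placeholders so it can be of any length.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        upc TEXT,
        upc_variation TEXT
    );
    CREATE TABLE products_upc_variations (
        product_id INTEGER,
        upc_variation TEXT,
        PRIMARY KEY (product_id, upc_variation)
    );
    CREATE INDEX idx_variation ON products_upc_variations (upc_variation, product_id);
    INSERT INTO products (upc, upc_variation) VALUES
        ('01234', '01234,12345,23456'),
        ('56789', '45678,34567'),
        ('056789', '045678,034567');
    INSERT INTO products_upc_variations (product_id, upc_variation) VALUES
        (1, '01234'), (1, '12345'), (1, '23456'),
        (2, '45678'), (2, '34567'),
        (3, '045678'), (3, '034567');
""")

wanted = ["12345", "34567"]
placeholders = ", ".join("?" for _ in wanted)  # one "?" per searched value
found = conn.execute(
    f"""
    SELECT DISTINCT p.id, p.upc
    FROM products p
    JOIN products_upc_variations v ON v.product_id = p.id
    WHERE v.upc_variation IN ({placeholders})
    ORDER BY p.id
    """,
    wanted,
).fetchall()

print(found)
```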
"Exploiting" a FULLTEXT INDEX
With a FULLTEXT INDEX on (upc_variation) you can use:
select p.*
from products p
where match (upc_variation) against ('12345 34567');
This looks quite "pretty" and is probably efficient. But though it works for this example, I wouldn't feel comfortable with this solution, because I can't say exactly, when it doesn't work.
Using JSON_OVERLAPS()
Since MySQL 8.0.17 you can use JSON_OVERLAPS(). You should either store the values as a JSON array, or convert the list to JSON "on the fly":
select p.*
from products p
where json_overlaps(
'["12345","34567"]',
concat('["', replace(upc_variation, ',', '","'), '"]')
);
No index can be used for this. But neither can one be used for FIND_IN_SET().
Using JSON_TABLE()
Since MySQL 8.0.4 you can use JSON_TABLE() to generate a normalized representation of the data "on the fly". Here again you would either store the data in a JSON array, or convert the list to JSON in the query:
select distinct p.*
from products p
join json_table(
concat('["', replace(p.upc_variation, ',', '","'), '"]'),
'$[*]' columns (upcv text path '$')
) v
where v.upcv in ('12345', '34567');
No index can be used here. And this is probably the slowest solution of all presented in this answer.
RLIKE / REGEXP
You can also use a regular expression:
select p.*
from products p
where p.upc_variation rlike '(^|,)(12345|34567)(,|$)'
See demo of all queries on dbfiddle.uk
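The regular expression from the RLIKE variant can be checked outside the database. The sketch below builds the same pattern in Python from a list of values; the helper name variation_pattern is made up, and re.escape is added only as a precaution in case a value ever contains regex metacharacters.

```python
import re


def variation_pattern(skus):
    """Build (^|,)(a|b|...)(,|$) so that only whole list items match."""
    alternatives = "|".join(re.escape(s) for s in skus)
    return re.compile(r"(^|,)(" + alternatives + r")(,|$)")


pattern = variation_pattern(["12345", "34567"])
rows = ["01234,12345,23456", "45678,34567", "045678,034567"]
matches = [bool(pattern.search(r)) for r in rows]

print(matches)
```

The third row does not match because 034567 only contains 34567 as a substring, and the pattern requires a comma or the start of the string directly before the value.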

You can try with below example:
SELECT * FROM TABLENAME
WHERE 1 AND ( FIND_IN_SET('3698520147', SKU)
OR UPC IN ('3698520147') )

I have a solution for you, you can consider this solution:
1: Create a temporary table example here: Sql Fiddle
select
tablename.id,
SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.sku_split, ',', numbers.n), ',', -1) sku_variation
from
numbers inner join tablename
on CHAR_LENGTH(tablename.sku_split)
-CHAR_LENGTH(REPLACE(tablename.sku_split, ',', ''))>=numbers.n-1
order by id, n
2: Use the temporary table to filter. find in set with your data

Performance considerations. The main thing that matters for performance is whether some index can be used. The complexity of the expression has only a minuscule impact on overall performance.
Step 1 is to learn what can be optimized, and in what way:
Equal: WHERE x = 1 -- can use index
IN/1: WHERE x IN (1) -- Turned into the Equal case by Optimizer
IN/many: WHERE x IN (22,33,44) -- Usually worse than Equal and better than "range"
Easy OR: WHERE (x = 22 OR x = 33) -- Turned into IN if possible
General OR: WHERE (sku = 22 OR upc = 33) -- not sargable (cf UNION)
Easy LIKE: WHERE x LIKE 'abc' -- turned into Equal
Range LIKE: WHERE x LIKE 'abc%' -- equivalent to "range" test
Wild LIKE: WHERE x LIKE '%abc%' -- not sargable
REGEXP: WHERE x RLIKE 'aaa|bbb|ccc' -- not sargable
FIND_IN_SET: WHERE FIND_IN_SET(x, '22,33,44') -- not sargable, even for single item
JSON: -- not sargable
FULLTEXT: WHERE MATCH(x) AGAINST('aaa bbb ccc') -- fast, but not equivalent
NOT: WHERE NOT ((any of the above)) -- usually poor performance
"Sargable" -- able to use index. Phrased differently "Hiding the column in a function call" prevents using an index.
FULLTEXT: There are many restrictions: "word-oriented", min word size, stopwords, etc. But it is very fast when it applies. Note: When used with outer tests, MATCH comes first (if possible), then further filtering will be done without the benefit of indexes, but on a smaller set of rows.
Even when an expression "can" use an index, it "may not". Whether a WHERE clause makes good use of an index is a much longer discussion than can be put here.
Step 2 Learn how to build composite indexes when you have multiple tests (WHERE ... AND ...):
When constructing a composite (multi-column) index, include columns in this order:
'Equal' -- any number of such columns.
'IN/many' column(s)
One range test (BETWEEN, <, etc)
(A couple of side notes.) The Optimizer is smart enough to clean up WHERE 1 AND .... But there are not many things that the Optimizer will handle. In particular, this is not sargable: AND DATE(x) = '2020-02-20', but this does optimize as a "range":
AND x >= '2020-02-20'
AND x < '2020-02-20' + INTERVAL 1 DAY
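The rewrite from DATE(x) = ... to a half-open range can be sanity-checked as follows. This sketch uses SQLite through Python rather than MySQL, so it only demonstrates that both predicates select the same rows; it says nothing about MySQL's optimizer. The events table and the helper day_bounds are assumptions for the example.

```python
import sqlite3
from datetime import date, timedelta


def day_bounds(day):
    """Return (start, end) so that start <= x < end covers exactly one calendar day."""
    start = date.fromisoformat(day)
    return start.isoformat(), (start + timedelta(days=1)).isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, x TEXT)")
conn.execute("CREATE INDEX idx_events_x ON events (x)")
conn.executemany("INSERT INTO events (x) VALUES (?)", [
    ("2020-02-19 23:59:59",),
    ("2020-02-20 00:00:00",),
    ("2020-02-20 12:30:00",),
    ("2020-02-21 00:00:00",),
])

lo, hi = day_bounds("2020-02-20")

# Column hidden inside a function call: the index on x cannot narrow the search.
by_function = conn.execute(
    "SELECT id FROM events WHERE date(x) = ? ORDER BY id", (lo,)
).fetchall()

# Bare column compared against a range: an index on x can be used.
by_range = conn.execute(
    "SELECT id FROM events WHERE x >= ? AND x < ? ORDER BY id", (lo, hi)
).fetchall()

print(by_function, by_range)
```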
Reading
Building indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Sargable: https://en.wikipedia.org/wiki/Sargable
Tips on Many-to-many: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

This depends on how you use it. In MySQL I found that find_in_set is way faster than using JSON when tested on the following commands, so much faster it wasn't even a competition (to be clear, the speed test did not include the set command line):
Fastest
set @ids = (select group_concat(`ID`) from `table`);
select count(*) from `table` where find_in_set(`ID`, @ids);
10 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where `ID` member of( @ids );
34 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where JSON_CONTAINS(@ids, convert(`ID`, char));
34 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where json_overlaps(@ids, json_array(`ID`));

SELECT * FROM tbl_name t1,(select
group_concat('3698520147',',','088586004490',',','868332000057',',',
'081308003405',',','088394000028',',','089541300893',',','0732511000148',',','009191711092',
',','752830528161') as skuid)t
WHERE FIND_IN_SET(t1.sku,t.skuid)>0

How to get fifth highest salary from salary table by single query? [duplicate]

I'm interested in learning some (ideally) database agnostic ways of selecting the nth row from a database table. It would also be interesting to see how this can be achieved using the native functionality of the following databases:
SQL Server
MySQL
PostgreSQL
SQLite
Oracle
I am currently doing something like the following in SQL Server 2005, but I'd be interested in seeing other's more agnostic approaches:
WITH Ordered AS (
SELECT ROW_NUMBER() OVER (ORDER BY OrderID) AS RowNumber, OrderID, OrderDate
FROM Orders)
SELECT *
FROM Ordered
WHERE RowNumber = 1000000
Credit for the above SQL: Firoz Ansari's Weblog
Update: See Troels Arvin's answer regarding the SQL standard. Troels, have you got any links we can cite?

There are ways of doing this in optional parts of the standard, but a lot of databases support their own way of doing it.
A really good site that talks about this and other things is http://troels.arvin.dk/db/rdbms/#select-limit.
Basically, PostgreSQL and MySQL support the non-standard:
SELECT...
LIMIT y OFFSET x
Oracle, DB2 and MSSQL support the standard windowing functions:
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
columns
FROM tablename
) AS foo
WHERE rownumber <= n
(which I just copied from the site linked above since I never use those DBs)
Update: As of PostgreSQL 8.4 the standard windowing functions are supported, so expect the second example to work for PostgreSQL as well.
Update: SQLite added window functions support in version 3.25.0 on 2018-09-15 so both forms also work in SQLite.
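Since SQLite is mentioned, the window-function form can be tried directly from Python's sqlite3 module, provided the bundled SQLite library is 3.25.0 or newer. This is only a sketch: the Orders data is made up, and the filter uses = instead of <= so that exactly the nth row is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(i * 10, f"2020-01-{i:02d}") for i in range(1, 11)],
)


def nth_order(n):
    """Return the nth row (1-based) in OrderID order, or None if there is none."""
    return conn.execute(
        """
        SELECT OrderID, OrderDate FROM (
            SELECT ROW_NUMBER() OVER (ORDER BY OrderID ASC) AS rownumber,
                   OrderID, OrderDate
            FROM Orders
        ) AS foo
        WHERE rownumber = ?
        """,
        (n,),
    ).fetchone()


print(nth_order(3))
```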

PostgreSQL supports windowing functions as defined by the SQL standard, but they're awkward, so most people use (the non-standard) LIMIT / OFFSET:
SELECT
*
FROM
mytable
ORDER BY
somefield
LIMIT 1 OFFSET 20;
This example selects the 21st row. OFFSET 20 is telling Postgres to skip the first 20 records. If you don't specify an ORDER BY clause, there's no guarantee which record you will get back, which is rarely useful.
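Because OFFSET counts skipped rows, a 1-based "nth row" helper has to pass n - 1. Here is a small sketch with Python's sqlite3 module (SQLite accepts the same LIMIT / OFFSET syntax); the helper name nth_row and the sample data are assumptions for the example.

```python
import sqlite3


def nth_row(conn, n):
    """Return the nth row (1-based) of mytable ordered by somefield, or None."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    return conn.execute(
        "SELECT * FROM mytable ORDER BY somefield LIMIT 1 OFFSET ?",
        (n - 1,),  # skip the n - 1 rows that come before the wanted one
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (somefield INTEGER)")
# Insert 30..1 in reverse so that only the ORDER BY determines the result.
conn.executemany("INSERT INTO mytable VALUES (?)", [(i,) for i in range(30, 0, -1)])

print(nth_row(conn, 21))
```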

I'm not sure about any of the rest, but I know SQLite and MySQL don't have any "default" row ordering. In those two dialects, at least, the following snippet grabs the 15th entry from the_table, sorting by the date/time it was added:
SELECT *
FROM the_table
ORDER BY added DESC
LIMIT 14, 1 -- skip 14 rows, then return 1 row
(of course, you'd need to have an added DATETIME field, and set it to the date/time that entry was added...)

SQL Server 2005 and above has this feature built-in. Use the ROW_NUMBER() function. It is excellent for web-pages with a << Prev and Next >> style browsing:
Syntax:
SELECT
*
FROM
(
SELECT
ROW_NUMBER () OVER (ORDER BY MyColumnToOrderBy) AS RowNum,
*
FROM
Table_1
) sub
WHERE
RowNum = 23

I suspect this is wildly inefficient but is quite a simple approach, which worked on a small dataset that I tried it on.
select top 1 field
from table
where field in (select top 5 field from table order by field asc)
order by field desc
This would get the 5th item, change the second top number to get a different nth item
SQL server only (I think) but should work on older versions that do not support ROW_NUMBER().

Verify it on SQL Server:
Select top 10 * From emp
EXCEPT
Select top 9 * From emp
This will give you 10th ROW of emp table!

Contrary to what some of the answers claim, the SQL standard is not silent regarding this subject.
Since SQL:2003, you have been able to use "window functions" to skip rows and limit result sets.
And in SQL:2008, a slightly simpler approach was added, using
OFFSET skip ROWS
FETCH FIRST n ROWS ONLY
Personally, I don't think that SQL:2008's addition was really needed, so if I were ISO, I would have kept it out of an already rather large standard.

1 small change: n-1 instead of n.
select *
from thetable
limit n-1, 1

SQL SERVER
Select nth record from top
SELECT * FROM (
SELECT
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID) AS ROW
FROM TABLE
) AS TMP
WHERE ROW = n
Select nth record from bottom
SELECT * FROM (
SELECT
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID DESC) AS ROW
FROM TABLE
) AS TMP
WHERE ROW = n

When we used to work in MSSQL 2000, we did what we called the "triple-flip":
EDITED
DECLARE @InnerPageSize int
DECLARE @OuterPageSize int
DECLARE @Count int
SELECT @Count = COUNT(<column>) FROM <TABLE>
SET @InnerPageSize = @PageNum * @PageSize
SET @OuterPageSize = @Count - ((@PageNum - 1) * @PageSize)
IF (@OuterPageSize < 0)
SET @OuterPageSize = 0
ELSE IF (@OuterPageSize > @PageSize)
SET @OuterPageSize = @PageSize
DECLARE @sql NVARCHAR(8000)
SET @sql = 'SELECT * FROM
(
SELECT TOP ' + CAST(@OuterPageSize AS nvarchar(5)) + ' * FROM
(
SELECT TOP ' + CAST(@InnerPageSize AS nvarchar(5)) + ' * FROM <TABLE> ORDER BY <column> ASC
) AS t1 ORDER BY <column> DESC
) AS t2 ORDER BY <column> ASC'
PRINT @sql
EXECUTE sp_executesql @sql
It wasn't elegant, and it wasn't fast, but it worked.
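The arithmetic behind the three sorts is easier to see on a plain list. The following Python sketch mirrors the steps of the script above on an in-memory list instead of a table; the function name triple_flip is made up, and the page numbers are 1-based as in the script.

```python
def triple_flip(rows, page_num, page_size):
    """Return one page of sorted rows using only 'TOP n' plus three sorts."""
    count = len(rows)
    inner_page_size = page_num * page_size
    outer_page_size = count - (page_num - 1) * page_size
    # Clamp exactly like the IF / ELSE IF in the script.
    if outer_page_size < 0:
        outer_page_size = 0
    elif outer_page_size > page_size:
        outer_page_size = page_size

    t0 = sorted(rows)[:inner_page_size]                # TOP inner ... ORDER BY ASC
    t1 = sorted(t0, reverse=True)[:outer_page_size]    # TOP outer ... ORDER BY DESC
    return sorted(t1)                                  # final ORDER BY ASC


data = list(range(1, 26))  # 25 rows
print(triple_flip(data, 2, 10))
print(triple_flip(data, 3, 10))
```

The second flip is what trims the last, partially filled page to the rows that actually exist.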

In Oracle 12c, You may use OFFSET..FETCH..ROWS option with ORDER BY
For example, to get the 3rd record from top:
SELECT *
FROM sometable
ORDER BY column_name
OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY;

Here is a quick solution.
SELECT * FROM table ORDER BY `id` DESC LIMIT N, 1
You get the last row with N=0, the second-to-last with N=1, the fourth-to-last with N=3, and so on.
This is a very common interview question, and this is a very simple answer to it.
Furthermore, if you want to sort by amount, ID, or some other numeric order, you can use the CAST function in MySQL.
SELECT DISTINCT (`amount`)
FROM cart
ORDER BY CAST( `amount` AS SIGNED ) DESC
LIMIT 4 , 1
With N = 4 you get the fifth-highest amount from the cart table. Substitute your own field and table names to adapt the solution.

ADD:
LIMIT n,1
That limits the output to one row after skipping the first n rows, i.e. it returns the (n+1)th row.

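Oracle (ROWNUM has to be captured as a column first, because WHERE ROWNUM = x never matches for x > 1):
select foo from (select foo, ROWNUM rn from (select foo from bar order by foo)) where rn = x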

For example, if you want to select every 10th row in MSSQL, you can use:
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY ColumnName1 ASC) AS rownumber, ColumnName1, ColumnName2
FROM TableName
) AS foo
WHERE rownumber % 10 = 0
Just take the modulo, and change the number 10 here to any number you want.

For SQL Server, a generic way to limit a result by row count is as such:
SET ROWCOUNT @row --@row = the number of rows to return
For Example:
set rowcount 20 --stops processing after 20 rows
select meat, cheese from dbo.sandwich --returns at most the first 20 rows
set rowcount 0 --sets rowcount back to all rows
This returns the first 20 rows, so the 20th row is the last one in the result (add an ORDER BY to make that deterministic). Be sure to put in the rowcount 0 afterward.

Here's a generic version of a sproc I recently wrote for Oracle that allows for dynamic paging/sorting - HTH
-- p_LowerBound = first row # in the returned set; if second page of 10 rows,
-- this would be 11 (-1 for unbounded/not set)
-- p_UpperBound = last row # in the returned set; if second page of 10 rows,
-- this would be 20 (-1 for unbounded/not set)
OPEN o_Cursor FOR
SELECT * FROM (
SELECT
Column1,
Column2,
rownum AS rn
FROM
(
SELECT
tbl.Column1,
tbl.column2
FROM MyTable tbl
WHERE
tbl.Column1 = p_PKParam OR
tbl.Column1 = -1
ORDER BY
DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 1, Column1, 'X'),'X'),
DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 1, Column1, 'X'),'X') DESC,
DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate),
DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate) DESC
))
WHERE
(rn >= p_lowerBound OR p_lowerBound = -1) AND
(rn <= p_upperBound OR p_upperBound = -1);

But really, isn't all this really just parlor tricks for good database design in the first place? The few times I needed functionality like this it was for a simple one off query to make a quick report. For any real work, using tricks like these is inviting trouble. If selecting a particular row is needed then just have a column with a sequential value and be done with it.

Nothing fancy, no special functions, in case you use Caché like I do...
SELECT TOP 1 * FROM (
SELECT TOP n * FROM <table>
ORDER BY ID Desc
)
ORDER BY ID ASC
Given that you have an ID column or a datestamp column you can trust.

For SQL Server, the following will return the first row from the given table.
declare @rowNumber int = 1;
select TOP(@rowNumber) * from [dbo].[someTable]
EXCEPT
select TOP(@rowNumber - 1) * from [dbo].[someTable];
You can loop through the values with something like this:
WHILE @constVar > 0
BEGIN
declare @rowNumber int = @constVar;
select TOP(@rowNumber) * from [dbo].[someTable]
EXCEPT
select TOP(@rowNumber - 1) * from [dbo].[someTable];
SET @constVar = @constVar - 1;
END;

LIMIT n,1 doesn't work in MS SQL Server. I think it's just about the only major database that doesn't support that syntax. To be fair, it isn't part of the SQL standard, although it is so widely supported that it should be. In everything except SQL server LIMIT works great. For SQL server, I haven't been able to find an elegant solution.

In Sybase SQL Anywhere:
SELECT TOP 1 START AT n * from table ORDER BY whatever
Don't forget the ORDER BY or it's meaningless.

T-SQL - Selecting N'th RecordNumber from a Table
select * from
(select row_number() over (order by Rand() desc) as Rno,* from TableName) T where T.Rno = RecordNumber
Where RecordNumber --> Record Number to Select
TableName --> To be Replaced with your Table Name
For example, to select the 5th record from a table Employee, your query should be
select * from
(select row_number() over (order by Rand() desc) as Rno,* from Employee) T where T.Rno = 5

SELECT
top 1 *
FROM
table_name
WHERE
column_name IN (
SELECT
top N column_name
FROM
table_name
ORDER BY
column_name
)
ORDER BY
column_name DESC
I've written this query for finding Nth row.
Example with this query would be
SELECT
top 1 *
FROM
Employee
WHERE
emp_id IN (
SELECT
top 7 emp_id
FROM
Employee
ORDER BY
emp_id
)
ORDER BY
emp_id DESC

I'm a bit late to the party here but I have done this without the need for windowing or using
WHERE x IN (...)
SELECT TOP 1
--select the value needed from t1
[col2]
FROM
(
SELECT TOP 2 --the Nth row, alter this to taste
UE2.[col1],
UE2.[col2],
UE2.[date],
UE2.[time],
UE2.[UID]
FROM
[table1] AS UE2
WHERE
UE2.[col1] = ID --this is a subquery
AND
UE2.[col2] IS NOT NULL
ORDER BY
UE2.[date] DESC, UE2.[time] DESC --sorting by date and time newest first
) AS t1
ORDER BY t1.[date] ASC, t1.[time] ASC --this reverses the order of the sort in t1
It seems to work fairly fast although to be fair I only have around 500 rows of data
This works in MSSQL

SELECT * FROM emp a
WHERE n = (
SELECT COUNT( _rowid)
FROM emp b
WHERE a. _rowid >= b. _rowid
);
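The correlated-count idea above needs no LIMIT, TOP or window function: a row is the nth one when exactly n rows have a key less than or equal to its own. Below is a sketch with Python's sqlite3 module, using an ordinary id column in place of _rowid; the table contents and the helper name nth_by_count are made up, and the keys are assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, "alice"), (20, "bob"), (30, "carol"), (40, "dave")],
)


def nth_by_count(n):
    """Return the row whose id is the nth smallest, via a correlated COUNT."""
    return conn.execute(
        """
        SELECT a.id, a.name
        FROM emp AS a
        WHERE ? = (SELECT COUNT(b.id) FROM emp AS b WHERE a.id >= b.id)
        """,
        (n,),
    ).fetchall()


print(nth_by_count(3))
```

The subquery runs once per outer row, so this tends to be slow on large tables.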

unbelievable that you can find a SQL engine executing this one ...
WITH sentence AS
(SELECT
stuff,
row = ROW_NUMBER() OVER (ORDER BY Id)
FROM
SentenceType
)
SELECT
sen.stuff
FROM sentence sen
WHERE sen.row = (ABS(CHECKSUM(NEWID())) % 100) + 1

select * from
(select * from ordered order by order_id limit 100) x order by
x.order_id desc limit 1;
First select the top 100 rows in ascending order, then select the last of them by ordering in descending order and limiting to 1. However, this is a very expensive statement as it accesses the data twice.

It seems to me that, to be efficient, you need to 1) generate a random number between 0 and one less than the number of database records, and 2) be able to select the row at that position. Unfortunately, different databases have different random number generators and different ways to select a row at a position in a result set - usually you specify how many rows to skip and how many rows you want, but it's done differently for different databases. Here is something that works for me in SQLite:
select *
from Table
limit abs(random()) % (select count(*) from Table), 1;
It does depend on being able to use a subquery in the limit clause (which in SQLite is LIMIT <recs to skip>,<recs to take>) Selecting the number of records in a table should be particularly efficient, being part of the database's meta data, but that depends on the database's implementation. Also, I don't know if the query will actually build the result set before retrieving the Nth record, but I would hope that it doesn't need to. Note that I'm not specifying an "order by" clause. It might be better to "order by" something like the primary key, which will have an index - getting the Nth record from an index might be faster if the database can't get the Nth record from the database itself without building the result set.
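The two steps (pick a random position, then fetch the row at that position) can also be split between the application and the database, which avoids relying on a subquery inside the LIMIT clause. This is a sketch using Python's sqlite3 module; the Words table, its columns and the helper name random_row are assumptions for the example, and unlike the query above it orders by the primary key so that positions are stable.

```python
import random
import sqlite3


def random_row(conn, rng=random):
    """Pick a uniformly random row from Words, or None if the table is empty."""
    (count,) = conn.execute("SELECT count(*) FROM Words").fetchone()
    if count == 0:
        return None
    skip = rng.randrange(count)  # random number between 0 and count - 1
    return conn.execute(
        "SELECT id, word FROM Words ORDER BY id LIMIT 1 OFFSET ?", (skip,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Words (id INTEGER PRIMARY KEY, word TEXT)")
WORDS = [(1, "alpha"), (2, "beta"), (3, "gamma")]
conn.executemany("INSERT INTO Words VALUES (?, ?)", WORDS)

print(random_row(conn))
```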

Most suitable answer I have seen on this article for sql server
WITH myTableWithRows AS (
SELECT (ROW_NUMBER() OVER (ORDER BY myTable.SomeField)) as row,*
FROM myTable)
SELECT * FROM myTableWithRows WHERE row = 3

How to select the nth row in a SQL database table?

I'm interested in learning some (ideally) database agnostic ways of selecting the nth row from a database table. It would also be interesting to see how this can be achieved using the native functionality of the following databases:
SQL Server
MySQL
PostgreSQL
SQLite
Oracle
I am currently doing something like the following in SQL Server 2005, but I'd be interested in seeing other's more agnostic approaches:
WITH Ordered AS (
SELECT ROW_NUMBER() OVER (ORDER BY OrderID) AS RowNumber, OrderID, OrderDate
FROM Orders)
SELECT *
FROM Ordered
WHERE RowNumber = 1000000
Credit for the above SQL: Firoz Ansari's Weblog
Update: See Troels Arvin's answer regarding the SQL standard. Troels, have you got any links we can cite?

There are ways of doing this in optional parts of the standard, but a lot of databases support their own way of doing it.
A really good site that talks about this and other things is http://troels.arvin.dk/db/rdbms/#select-limit.
Basically, PostgreSQL and MySQL supports the non-standard:
SELECT...
LIMIT y OFFSET x
Oracle, DB2 and MSSQL supports the standard windowing functions:
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
columns
FROM tablename
) AS foo
WHERE rownumber <= n
(which I just copied from the site linked above since I never use those DBs)
Update: As of PostgreSQL 8.4 the standard windowing functions are supported, so expect the second example to work for PostgreSQL as well.
Update: SQLite added window functions support in version 3.25.0 on 2018-09-15 so both forms also work in SQLite.

PostgreSQL supports windowing functions as defined by the SQL standard, but they're awkward, so most people use (the non-standard) LIMIT / OFFSET:
SELECT
*
FROM
mytable
ORDER BY
somefield
LIMIT 1 OFFSET 20;
This example selects the 21st row. OFFSET 20 is telling Postgres to skip the first 20 records. If you don't specify an ORDER BY clause, there's no guarantee which record you will get back, which is rarely useful.

I'm not sure about any of the rest, but I know SQLite and MySQL don't have any "default" row ordering. In those two dialects, at least, the following snippet grabs the 15th entry from the_table, sorting by the date/time it was added:
SELECT *
FROM the_table
ORDER BY added DESC
LIMIT 1,15
(of course, you'd need to have an added DATETIME field, and set it to the date/time that entry was added...)
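In the comma form of LIMIT used by MySQL and SQLite, the first number is the offset (rows to skip) and the second is the row count, so the order of the two numbers matters. A minimal sketch in SQLite, with invented rows in the_table, comparing the two orderings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, added TEXT)")
# Hypothetical data: row i was added on day i of the month.
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [(i, f"2024-03-{i:02d} 12:00:00") for i in range(1, 31)],
)

# LIMIT 14, 1: skip 14 rows, return 1 -> the 15th newest entry.
fifteenth = conn.execute(
    "SELECT id FROM the_table ORDER BY added DESC LIMIT 14, 1"
).fetchall()

# LIMIT 1, 15: skip 1 row, return 15 -> entries 2 through 16.
swapped = conn.execute(
    "SELECT id FROM the_table ORDER BY added DESC LIMIT 1, 15"
).fetchall()

print(fifteenth)     # [(16,)]
print(len(swapped))  # 15
```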

SQL Server 2005 and above has this feature built in. Use the ROW_NUMBER() function. It is excellent for web pages with << Prev and Next >> style browsing:
Syntax:
SELECT
*
FROM
(
SELECT
ROW_NUMBER () OVER (ORDER BY MyColumnToOrderBy) AS RowNum,
*
FROM
Table_1
) sub
WHERE
RowNum = 23

I suspect this is wildly inefficient but is quite a simple approach, which worked on a small dataset that I tried it on.
select top 1 field
from table
where field in (select top 5 field from table order by field asc)
order by field desc
This would get the 5th item; change the second top number to get a different nth item.
SQL Server only (I think), but it should work on older versions that do not support ROW_NUMBER().
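The same idea can be tried outside SQL Server. The sketch below is a translation, not the original T-SQL: it uses SQLite's LIMIT where the answer uses TOP, a made-up table t with a single column field, and it assumes the values in field are distinct (duplicates would make the IN test match extra rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in (50, 10, 40, 20, 60, 30)])

def nth_smallest(n):
    """Take the n smallest values, then keep the largest of those."""
    row = conn.execute(
        """
        SELECT field FROM t
        WHERE field IN (SELECT field FROM t ORDER BY field ASC LIMIT ?)
        ORDER BY field DESC
        LIMIT 1
        """,
        (n,),
    ).fetchone()
    return row[0] if row else None

print(nth_smallest(5))  # 50
```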

Verify it on SQL Server:
Select top 10 * From emp
EXCEPT
Select top 9 * From emp
This will give you the 10th row of the emp table!
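One thing worth checking with the EXCEPT approach is that EXCEPT works on sets of rows, so a 10th row that is an exact copy of an earlier row disappears from the result. The sketch below reproduces this in SQLite, with LIMIT in subqueries standing in for TOP, an invented one-column emp table, and rowid used as the ordering (an assumption, since the original query has no ORDER BY).

```python
import sqlite3

def tenth_row(names):
    """Return 'first 10 rows EXCEPT first 9 rows' for a fresh emp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?)", [(n,) for n in names])
    return conn.execute(
        """
        SELECT * FROM (SELECT name FROM emp ORDER BY rowid LIMIT 10)
        EXCEPT
        SELECT * FROM (SELECT name FROM emp ORDER BY rowid LIMIT 9)
        """
    ).fetchall()

distinct = tenth_row(list("abcdefghij"))  # all ten rows differ
with_dup = tenth_row(list("abcdefghic"))  # 10th row repeats the 3rd

print(distinct)  # [('j',)]
print(with_dup)  # []
```

So the trick is only reliable when every row is unique, for example when the table has a primary key in the select list.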

Contrary to what some of the answers claim, the SQL standard is not silent regarding this subject.
Since SQL:2003, you have been able to use "window functions" to skip rows and limit result sets.
And in SQL:2008, a slightly simpler approach was added, using
OFFSET skip ROWS
FETCH FIRST n ROWS ONLY
Personally, I don't think that SQL:2008's addition was really needed, so if I were ISO, I would have kept it out of an already rather large standard.

1 small change: n-1 instead of n.
select *
from thetable
limit n-1, 1

SQL SERVER
Select the nth record from the top
SELECT * FROM (
SELECT
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID) AS ROW
FROM TABLE
) AS TMP
WHERE ROW = n
Select the nth record from the bottom
SELECT * FROM (
SELECT
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID DESC) AS ROW
FROM TABLE
) AS TMP
WHERE ROW = n

When we used to work in MSSQL 2000, we did what we called the "triple-flip":
EDITED
DECLARE @InnerPageSize int
DECLARE @OuterPageSize int
DECLARE @Count int
SELECT @Count = COUNT(<column>) FROM <TABLE>
SET @InnerPageSize = @PageNum * @PageSize
SET @OuterPageSize = @Count - ((@PageNum - 1) * @PageSize)
IF (@OuterPageSize < 0)
SET @OuterPageSize = 0
ELSE IF (@OuterPageSize > @PageSize)
SET @OuterPageSize = @PageSize
DECLARE @sql NVARCHAR(4000)
SET @sql = 'SELECT * FROM
(
SELECT TOP ' + CAST(@OuterPageSize AS nvarchar(5)) + ' * FROM
(
SELECT TOP ' + CAST(@InnerPageSize AS nvarchar(5)) + ' * FROM <TABLE> ORDER BY <column> ASC
) AS t1 ORDER BY <column> DESC
) AS t2 ORDER BY <column> ASC'
PRINT @sql
EXECUTE sp_executesql @sql
It wasn't elegant, and it wasn't fast, but it worked.
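The page-size arithmetic of the triple-flip is easier to follow with concrete numbers. The sketch below is a re-creation under assumptions, not the original procedure: it runs against SQLite with LIMIT in place of TOP, and the items table, its id column, and the function name are all made up for the example.

```python
import sqlite3

def triple_flip_page(conn, page_num, page_size):
    """Return one page of ids using the sort-asc / sort-desc / sort-asc trick."""
    count = conn.execute("SELECT COUNT(id) FROM items").fetchone()[0]
    # Rows needed to reach the end of the requested page.
    inner = page_num * page_size
    # Rows that actually belong to the page, clamped to [0, page_size].
    outer = count - (page_num - 1) * page_size
    outer = max(0, min(outer, page_size))
    rows = conn.execute(
        """
        SELECT id FROM (
            SELECT id FROM (
                SELECT id FROM items ORDER BY id ASC LIMIT ?
            ) AS t1 ORDER BY id DESC LIMIT ?
        ) AS t2 ORDER BY id ASC
        """,
        (inner, outer),
    ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 26)])

print(triple_flip_page(conn, 2, 10))  # [11, 12, ..., 20]
print(triple_flip_page(conn, 3, 10))  # [21, 22, 23, 24, 25]
print(triple_flip_page(conn, 4, 10))  # []
```

With 25 rows and a page size of 10, page 3 is a short page: the outer size works out to 25 - 20 = 5, which is why the clamp is needed.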

In Oracle 12c, you can use the OFFSET ... FETCH ... ROWS option with ORDER BY.
For example, to get the 3rd record from top:
SELECT *
FROM sometable
ORDER BY column_name
OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY;

Here is a quick solution.
SELECT * FROM table ORDER BY `id` DESC LIMIT N, 1
Here you get the last row with N=0, the second-to-last with N=1, the fourth-to-last with N=3, and so on.
This is a very common interview question, and this is a very simple answer to it.
Furthermore, if you want to sort by amount, ID, or some other numeric order, you can use the CAST function in MySQL.
SELECT DISTINCT (`amount`)
FROM cart
ORDER BY CAST( `amount` AS SIGNED ) DESC
LIMIT 4 , 1
Here, with N = 4, you get the fifth-highest distinct amount from the cart table. Substitute your own field and table names to adapt the query.

ADD:
LIMIT n,1
That will limit the results to one result starting at result n.

Oracle:
select foo from (select foo, ROWNUM rn from (select foo from bar order by foo)) where rn = x

For example, if you want to select every 10th row in MSSQL, you can use:
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY ColumnName1 ASC) AS rownumber, ColumnName1, ColumnName2
FROM TableName
) AS foo
WHERE rownumber % 10 = 0
Just take the modulus, and change the number 10 here to any number you want.
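The modulus filter can be tried without SQL Server. Here is a rough sketch of the same query against SQLite (assuming version 3.25.0 or later for ROW_NUMBER), with invented contents for TableName; the column values are multiples of 3 so that row numbers and values are easy to tell apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (ColumnName1 INTEGER, ColumnName2 TEXT)")
# 35 hypothetical rows with values 3, 6, ..., 105.
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?)",
    [(i * 3, f"row{i}") for i in range(1, 36)],
)

every_tenth = conn.execute(
    """
    SELECT rownumber, ColumnName1 FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY ColumnName1 ASC) AS rownumber,
               ColumnName1, ColumnName2
        FROM TableName
    ) AS foo
    WHERE rownumber % 10 = 0
    ORDER BY rownumber
    """
).fetchall()

print(every_tenth)  # [(10, 30), (20, 60), (30, 90)]
```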

For SQL Server, a generic way to go by row number is as such:
SET ROWCOUNT @row --@row = the row number you wish to work on.
For example:
set rowcount 20 --stop returning rows after the first 20
select meat, cheese from dbo.sandwich --select columns from table; the 20th row is the last one returned
set rowcount 0 --sets rowcount back to all rows
This returns the first 20 rows, so the 20th row's information is the last row in the result; without an ORDER BY, which rows you get is not guaranteed. Be sure to put in the rowcount 0 afterward.

Here's a generic version of a sproc I recently wrote for Oracle that allows for dynamic paging/sorting - HTH
-- p_LowerBound = first row # in the returned set; if second page of 10 rows,
-- this would be 11 (-1 for unbounded/not set)
-- p_UpperBound = last row # in the returned set; if second page of 10 rows,
-- this would be 20 (-1 for unbounded/not set)
OPEN o_Cursor FOR
SELECT * FROM (
SELECT
Column1,
Column2,
rownum AS rn
FROM
(
SELECT
tbl.Column1,
tbl.column2
FROM MyTable tbl
WHERE
tbl.Column1 = p_PKParam OR
tbl.Column1 = -1
ORDER BY
DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 1, Column1, 'X'),'X'),
DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 1, Column1, 'X'),'X') DESC,
DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate),
DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate) DESC
))
WHERE
(rn >= p_lowerBound OR p_lowerBound = -1) AND
(rn <= p_upperBound OR p_upperBound = -1);

But really, isn't all this just a set of parlor tricks standing in for good database design in the first place? The few times I needed functionality like this, it was for a simple one-off query to make a quick report. For any real work, using tricks like these is inviting trouble. If selecting a particular row is needed, then just have a column with a sequential value and be done with it.

Nothing fancy, no special functions, in case you use Caché like I do...
SELECT TOP 1 * FROM (
SELECT TOP n * FROM <table>
ORDER BY ID Desc
)
ORDER BY ID ASC
Given that you have an ID column or a datestamp column you can trust.

For SQL Server, the following will return the first row from the given table.
declare @rowNumber int = 1;
select TOP(@rowNumber) * from [dbo].[someTable]
EXCEPT
select TOP(@rowNumber - 1) * from [dbo].[someTable];
You can loop through the values with something like this:
WHILE @constVar > 0
BEGIN
declare @rowNumber int = @constVar;
select TOP(@rowNumber) * from [dbo].[someTable]
EXCEPT
select TOP(@rowNumber - 1) * from [dbo].[someTable];
SET @constVar = @constVar - 1;
END;

LIMIT n,1 doesn't work in MS SQL Server. I think it's just about the only major database that doesn't support that syntax. To be fair, it isn't part of the SQL standard, although it is so widely supported that it should be. In everything except SQL server LIMIT works great. For SQL server, I haven't been able to find an elegant solution.

In Sybase SQL Anywhere:
SELECT TOP 1 START AT n * from table ORDER BY whatever
Don't forget the ORDER BY or it's meaningless.

T-SQL - Selecting N'th RecordNumber from a Table
select * from
(select row_number() over (order by Rand() desc) as Rno,* from TableName) T where T.Rno = RecordNumber
where RecordNumber --> the record number to select
TableName --> to be replaced with your table name
For example, to select the 5th record from the Employee table, your query would be
select * from
(select row_number() over (order by Rand() desc) as Rno,* from Employee) T where T.Rno = 5

SELECT
top 1 *
FROM
table_name
WHERE
column_name IN (
SELECT
top N column_name
FROM
table_name
ORDER BY
column_name
)
ORDER BY
column_name DESC
I wrote this query to find the Nth row.
An example using this query would be
SELECT
top 1 *
FROM
Employee
WHERE
emp_id IN (
SELECT
top 7 emp_id
FROM
Employee
ORDER BY
emp_id
)
ORDER BY
emp_id DESC

I'm a bit late to the party here but I have done this without the need for windowing or using
WHERE x IN (...)
SELECT TOP 1
--select the value needed from t1
[col2]
FROM
(
SELECT TOP 2 --the Nth row, alter this to taste
UE2.[col1],
UE2.[col2],
UE2.[date],
UE2.[time],
UE2.[UID]
FROM
[table1] AS UE2
WHERE
UE2.[col1] = ID --this is a subquery
AND
UE2.[col2] IS NOT NULL
ORDER BY
UE2.[date] DESC, UE2.[time] DESC --sorting by date and time newest first
) AS t1
ORDER BY t1.[date] ASC, t1.[time] ASC --this reverses the order of the sort in t1
It seems to work fairly fast, although to be fair I only have around 500 rows of data.
This works in MSSQL.

SELECT * FROM emp a
WHERE n = (
SELECT COUNT(_rowid)
FROM emp b
WHERE a._rowid >= b._rowid
);
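The query above picks the row for which exactly n rows have an equal or smaller row id. A small sketch of that idea in SQLite follows; it assumes SQLite's built-in rowid as a stand-in for the _rowid column, and the emp rows are invented. The correlated subquery runs once per outer row, so this is likely to be slow on large tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?)",
    [(n,) for n in ("ann", "bob", "cy", "dee", "eve")],
)

n = 3
# Keep the row whose rowid is >= exactly n rowids (itself included).
rows = conn.execute(
    """
    SELECT a.name FROM emp a
    WHERE ? = (
        SELECT COUNT(b.rowid)
        FROM emp b
        WHERE a.rowid >= b.rowid
    )
    """,
    (n,),
).fetchall()

print(rows)  # [('cy',)]
```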

unbelievable that you can find a SQL engine executing this one ...
WITH sentence AS
(SELECT
stuff,
row = ROW_NUMBER() OVER (ORDER BY Id)
FROM
SentenceType
)
SELECT
sen.stuff
FROM sentence sen
WHERE sen.row = (ABS(CHECKSUM(NEWID())) % 100) + 1

select * from
(select * from ordered order by order_id limit 100) x order by
x.order_id desc limit 1;
First select the top 100 rows in ascending order, then sort those in descending order and limit to 1 to pick the last of them. However, this is a very expensive statement, as it accesses the data twice.
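The statement runs as written in SQLite, which makes it easy to probe one edge case: when the table holds fewer than n rows, the query still returns a row (the last one) instead of nothing. The sketch below assumes a table named ordered with 50 invented rows; the helper name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ordered (order_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO ordered VALUES (?)", [(i,) for i in range(1, 51)])

def nth_by_double_limit(n):
    """First n rows ascending, then the last of those via a descending sort."""
    return conn.execute(
        """
        select * from
        (select * from ordered order by order_id limit ?) x
        order by x.order_id desc limit 1
        """,
        (n,),
    ).fetchone()

print(nth_by_double_limit(20))   # (20,)
print(nth_by_double_limit(100))  # (50,)  only 50 rows exist
```

If a missing nth row should yield no result, the row count has to be checked separately.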

It seems to me that, to be efficient, you need to 1) generate a random number between 0 and one less than the number of database records, and 2) be able to select the row at that position. Unfortunately, different databases have different random number generators and different ways to select a row at a position in a result set - usually you specify how many rows to skip and how many rows you want, but it's done differently for different databases. Here is something that works for me in SQLite:
select *
from Words
limit abs(random()) % (select count(*) from Words), 1;
It does depend on being able to use a subquery in the limit clause (which in SQLite is LIMIT <recs to skip>,<recs to take>) Selecting the number of records in a table should be particularly efficient, being part of the database's meta data, but that depends on the database's implementation. Also, I don't know if the query will actually build the result set before retrieving the Nth record, but I would hope that it doesn't need to. Note that I'm not specifying an "order by" clause. It might be better to "order by" something like the primary key, which will have an index - getting the Nth record from an index might be faster if the database can't get the Nth record from the database itself without building the result set.
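The two steps described above (pick a random position, then fetch the row at that position) can also be split between the application and the database. This is only a sketch of that variant, not the single-statement query: it assumes a Words table with an id primary key and a few made-up words, draws the offset in Python, and orders by the primary key as suggested.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Words (id INTEGER PRIMARY KEY, word TEXT)")
words = ["alpha", "beta", "gamma", "delta", "epsilon"]
conn.executemany("INSERT INTO Words (word) VALUES (?)", [(w,) for w in words])

# Step 1: random number between 0 and one less than the number of records.
count = conn.execute("SELECT COUNT(*) FROM Words").fetchone()[0]
offset = random.randrange(count)

# Step 2: select the row at that position (skip `offset` rows, take one).
row = conn.execute(
    "SELECT word FROM Words ORDER BY id LIMIT 1 OFFSET ?",
    (offset,),
).fetchone()

print(offset, row[0])
```

This costs two round trips instead of one, but it avoids relying on subqueries inside the LIMIT clause.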

The most suitable answer I have seen on this page for SQL Server:
WITH myTableWithRows AS (
SELECT (ROW_NUMBER() OVER (ORDER BY myTable.SomeField)) as row,*
FROM myTable)
SELECT * FROM myTableWithRows WHERE row = 3
