how to group by first letter of text - mysql

I have the following query in MySQL:
select val,count(val)
from ....
where ...
group by val
It gives:
val count
CE3 4
CE5 1
A3 12
BRICK4 5
BRICK2 2
I want to show only the row with the highest count per first letter.
Which means all val starting with A are one group, all val starting with B are another group etc...
The expected result is:
val count
CE3 4 / CE3 CE5 are in group C , CE3 has higher count
A3 12 / A3 is the only one in group A
BRICK4 5 / BRICK4 BRICK2 are in group B, BRICK4 has higher count
How can I do that?
Edit:
what I thought to do is to create a temp column in a query that will represent the group something like:
val count group
CE3 4 C
CE5 1 C
A3 12 A
BRICK4 5 B
BRICK2 2 B
and then search for the row with the highest count value per group.
But I'm not sure this is the best approach.

Try something like this:
select
val
,MAX(count) as count
,left(val,1) as first_letter
from (
select
val
,count(val) as count
from tbl
group by val
) a
group by left(val, 1);
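First get the count per val, then take MAX(count) from that result, grouped by first letter. Note that val is not in the GROUP BY, so the val returned for each letter is not guaranteed to be the one that has the maximum count (and the query is rejected when ONLY_FULL_GROUP_BY is enabled).
UPDATE: (thanks to Vamsi Prabhala for pointing out that my first solution wasn't the best one)
After getting the count per val, I used variables to emulate ROW_NUMBER() (as in MS SQL) and selected the first row of each group, ordered by first_letter and count desc: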
select val, count, first_letter from (
select
@i:=CASE
WHEN @first_letter = a.first_letter THEN @i + 1
ELSE 1
END as rn
,@first_letter:= a.first_letter as first_letter
,a.val
,a.count
from (
select
val
,count(val) as count
,left(val,1) as first_letter
from tbl
group by val
)a, (select @i:=0, @first_letter:='') b
order by a.first_letter, a.count desc
) c
where rn = 1
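On MySQL 8.0 or later the variable trick can probably be replaced by a real ROW_NUMBER(). The following is only a sketch under stated assumptions: it runs against SQLite through Python's sqlite3 module as a stand-in for MySQL (SQLite 3.25+ is needed for window functions), substr(val, 1, 1) replaces MySQL's LEFT(val, 1), and the table name tbl and its rows are hypothetical data built to match the counts in the question.

```python
import sqlite3

# Hypothetical raw rows: one row per occurrence of val, matching the question's counts.
rows = [("CE3",)] * 4 + [("CE5",)] + [("A3",)] * 12 + [("BRICK4",)] * 5 + [("BRICK2",)] * 2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (val TEXT)")
conn.executemany("INSERT INTO tbl (val) VALUES (?)", rows)

query = """
SELECT val, cnt
FROM (
    SELECT val, cnt,
           ROW_NUMBER() OVER (
               PARTITION BY substr(val, 1, 1)   -- LEFT(val, 1) in MySQL
               ORDER BY cnt DESC, val           -- val breaks ties deterministically
           ) AS rn
    FROM (SELECT val, COUNT(*) AS cnt FROM tbl GROUP BY val) counts
) ranked
WHERE rn = 1
ORDER BY val
"""
top_per_letter = conn.execute(query).fetchall()
print(top_per_letter)  # [('A3', 12), ('BRICK4', 5), ('CE3', 4)]
```

The innermost query counts per val, the middle one numbers the rows inside each first-letter group, and the outer one keeps only row 1 of each group.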

This can be done with variables to rank the rows based on counts.
select val,val_cnt
from (
select val,val_cnt,@rn:=case when @prev=left(val,1) then @rn+1 else 1 end as rnum,
@prev:=left(val,1)
from (select val,count(val) as val_cnt
from ....
where ...
group by val
) t
cross join (select @rn:=0,@prev:='') r
order by left(val,1),val_cnt desc,val -- added val to order by to break ties
) t
where rnum=1

Not sure this is what you want.
First, get the max count for each starting letter:
SELECT LEFT(val, 1) as l_val, MAX(count_val)
FROM (
select val,count(val) as count_val
from ....
where ...
group by val
) t
GROUP BY l_val
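Then, since you want val to appear, join back to the per-val counts on both the first letter and the count, so that only the val holding the maximum is returned:
SELECT t2.val, t1.max_val
FROM (
SELECT LEFT(val, 1) as l_val, MAX(count_val) as max_val
FROM (
select val,count(val) as count_val
from ....
where ...
group by val
) t
GROUP BY l_val) t1
INNER JOIN (
select val,count(val) as count_val
from ....
where ...
group by val
) t2 ON LEFT(t2.val,1) = t1.l_val AND t2.count_val = t1.max_val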
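To see how a join against the per-letter maximum behaves, here is a small sketch. It assumes a hypothetical pre-aggregated table counts(val, cnt), uses SQLite via Python's sqlite3 in place of MySQL (so substr stands in for LEFT), and adds a made-up value CE7 that ties with CE3. The join matches on both the first letter and the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated counts; CE7 ties with CE3 to show the tie behaviour.
conn.execute("CREATE TABLE counts (val TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO counts VALUES (?, ?)",
    [("CE3", 4), ("CE5", 1), ("CE7", 4), ("A3", 12), ("BRICK4", 5), ("BRICK2", 2)],
)

query = """
SELECT c.val, c.cnt
FROM counts c
JOIN (
    SELECT substr(val, 1, 1) AS first_letter, MAX(cnt) AS max_cnt
    FROM counts
    GROUP BY substr(val, 1, 1)
) m
  ON substr(c.val, 1, 1) = m.first_letter   -- same group
 AND c.cnt = m.max_cnt                      -- and holds the group's maximum
ORDER BY c.val
"""
winners = conn.execute(query).fetchall()
print(winners)  # [('A3', 12), ('BRICK4', 5), ('CE3', 4), ('CE7', 4)]
```

Unlike a ROW_NUMBER-based query, this form returns every val that shares the maximum, so tied groups produce more than one row.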

Related

Filter out records based on condition SQL

I am trying to filter out some records based on condition but couldn't get the proper results.
Data:
GID OID SID Z
1 1 1 A
1 2 2 B
1 3 3 C
1 2 4 B
Expected Result:
GID OID SID Z
1 1 1 A
1 3 3 C
Here GID, OID can be repeated but not SID.
Need to filter out all records where Z contains 'A' & 'C'
What I have tried:
select distinct GID, OID, SID, Z
from table
where Z ilike ('A') or Z ilike ('C')
but this query will include all records of the sample GID.
Moreover I have also thought of self join but could not frame the query around that.
I think this is the query you need
WITH cte AS (
SELECT
gid, oid, sid, z,
ROW_NUMBER() OVER () AS rn
FROM data
WHERE LOWER(z) LIKE '%a%'
OR LOWER(Z) LIKE '%c%'
)
SELECT gid, oid, sid, z FROM cte c
WHERE NOT EXISTS (
SELECT sid FROM cte t
WHERE t.z = c.z
AND t.sid = c.sid
AND t.rn < c.rn
)
I use ROW_NUMBER to be able to check if a sid value repeats.
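Since the question states that SID never repeats, a plain case-insensitive filter may already be enough for the sample data. The sketch below assumes exact matches on 'A' and 'C' rather than substring matches, uses the table name data from the answer, and runs on SQLite through Python's sqlite3 as a stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (gid INTEGER, oid INTEGER, sid INTEGER, z TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "A"), (1, 2, 2, "B"), (1, 3, 3, "C"), (1, 2, 4, "B")],
)

# UPPER() makes the comparison case-insensitive, similar in spirit to ILIKE.
kept = conn.execute(
    "SELECT gid, oid, sid, z FROM data WHERE UPPER(z) IN ('A', 'C') ORDER BY sid"
).fetchall()
print(kept)  # [(1, 1, 1, 'A'), (1, 3, 3, 'C')]
```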

Subtracting rows based on condition

Suppose I have a table that looks like this:
OrderNumber OrderType
1 D
1 D
1 R
2 D
2 R
3 D
3 D
3 D
3 R
3 R
The result should be:
OrderNumber OrderType
1 D
3 D
Here, an R indicates that one row should be removed from the order. In the first example we have 2 D's and 1 R, so we remove one D and are left with 1 D. Is there a way to do this in SQL?
If your MySQL version supports CTEs and window functions, we can use the ROW_NUMBER window function to number the rows within each OrderNumber, OrderType pair.
Then use an EXISTS subquery to keep only the OrderType = 'D' rows whose row number is greater than the number of R rows for the same OrderNumber.
with cte as (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY OrderNumber,OrderType) rn,
COUNT(*) OVER(PARTITION BY OrderNumber,OrderType) cnt
FROM T
)
SELECT c1.OrderNumber,
c1.OrderType
FROM cte c1
WHERE EXISTS (
SELECT 1
FROM cte c2
WHERE c1.OrderNumber = c2.OrderNumber
AND c2.OrderType = 'R'
AND c1.rn > c2.cnt
)
AND c1.OrderType = 'D'
You can use a window function. This is SQLite syntax, but MySQL should be fairly close.
select A.OrderNumber,A.OrderType
from (select OrderNumber,OrderType,row_number() over(partition by OrderNumber) as RN
from b where OrderType='D') A
left join
(select sum(case when OrderType='R' then 1 else 0 end) as cnt,OrderNumber
from b group by OrderNumber) B
on A.OrderNumber=B.OrderNumber
where A.rn>B.cnt;
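The idea of numbering the D rows and keeping those whose number exceeds the R count can be checked on the sample data. This is a minimal sketch that assumes the table is called T (as in the first answer) and runs on SQLite 3.25+ via Python's sqlite3 rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (OrderNumber INTEGER, OrderType TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, "D"), (1, "D"), (1, "R"), (2, "D"), (2, "R"),
     (3, "D"), (3, "D"), (3, "D"), (3, "R"), (3, "R")],
)

query = """
WITH d AS (            -- number the D rows inside each order
    SELECT OrderNumber, OrderType,
           ROW_NUMBER() OVER (PARTITION BY OrderNumber) AS rn
    FROM T
    WHERE OrderType = 'D'
),
r AS (                 -- count the R rows per order (0 if there are none)
    SELECT OrderNumber,
           SUM(CASE WHEN OrderType = 'R' THEN 1 ELSE 0 END) AS cnt
    FROM T
    GROUP BY OrderNumber
)
SELECT d.OrderNumber, d.OrderType
FROM d
JOIN r ON r.OrderNumber = d.OrderNumber
WHERE d.rn > r.cnt     -- each R cancels one D
ORDER BY d.OrderNumber
"""
remaining = conn.execute(query).fetchall()
print(remaining)  # [(1, 'D'), (3, 'D')]
```

Because the R count comes from a grouped query over all rows, an order without any R rows gets a count of 0 and keeps all of its D rows.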

SQL: SUM selected Rows [duplicate]

How can you select the top n max values from a table?
For a table like this:
column1 column2
1 foo
2 foo
3 foo
4 foo
5 bar
6 bar
7 bar
8 bar
For n=2, the result needs to be:
3
4
7
8
The approach below selects only the max value for each group.
SELECT max(column1) FROM table GROUP BY column2
Returns:
4
8
For n=2 you could use:
SELECT max(column1) m
FROM table t
GROUP BY column2
UNION
SELECT max(column1) m
FROM table t
WHERE column1 NOT IN (SELECT max(column1)
FROM table
WHERE column2 = t.column2)
GROUP BY column2
for any n you could use approaches described here to simulate rank over partition.
EDIT:
Actually this article will give you exactly what you need.
Basically it is something like this
SELECT t.*
FROM
(SELECT grouper,
(SELECT val
FROM table li
WHERE li.grouper = dlo.grouper
ORDER BY
li.grouper, li.val DESC
LIMIT 2,1) AS mid
FROM
(
SELECT DISTINCT grouper
FROM table
) dlo
) lo, table t
WHERE t.grouper = lo.grouper
AND t.val > lo.mid
Replace grouper with the name of the column you want to group by and val with the name of the column that holds the values.
To work out exactly how it functions, go step by step from the innermost query and run each one.
Also, the query above is slightly simplified: the subquery that finds mid can return NULL if a category does not have enough values, so it should be wrapped in COALESCE with some constant that makes sense in the comparison (in your case it would be the MIN of the domain of val; in the article it is MAX).
EDIT2:
I forgot to mention that it is the LIMIT 2,1 that determines the n (LIMIT n,1).
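A sketch of the LIMIT n,1 threshold with the COALESCE guard might look like this. Assumptions: the table is named tab, a hypothetical extra group baz with a single row is added to trigger the NULL case, values are non-negative so -1 works as the fallback constant, and SQLite (via Python's sqlite3) is used in place of MySQL, so LIMIT n,1 is spelled LIMIT 1 OFFSET n.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (column1 INTEGER, column2 TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, "foo"), (2, "foo"), (3, "foo"), (4, "foo"),
     (5, "bar"), (6, "bar"), (7, "bar"), (8, "bar"),
     (9, "baz")],  # hypothetical group with fewer than n + 1 rows
)

def top_n_per_group(conn, n, floor=-1):
    """Return the n largest column1 values of every column2 group."""
    query = """
    SELECT t.column1
    FROM (
        SELECT dlo.column2,
               COALESCE(
                   (SELECT li.column1
                    FROM tab li
                    WHERE li.column2 = dlo.column2
                    ORDER BY li.column1 DESC
                    LIMIT 1 OFFSET :n),   -- the (n + 1)-th largest value
                   :floor                 -- fallback when the group is too small
               ) AS mid
        FROM (SELECT DISTINCT column2 FROM tab) dlo
    ) lo
    JOIN tab t ON t.column2 = lo.column2
    WHERE t.column1 > lo.mid
    ORDER BY t.column1
    """
    return [row[0] for row in conn.execute(query, {"n": n, "floor": floor})]

result = top_n_per_group(conn, 2)
print(result)  # [3, 4, 7, 8, 9]
```

Without the COALESCE, the baz group would compare against NULL and its only row would be dropped.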
If you are using MySQL, why don't you use the LIMIT functionality?
Sort the records in descending order and limit the result to the top n, e.g.:
SELECT yourColumnName FROM yourTableName
ORDER BY Id desc
LIMIT 0,3
MySQL 8.0 and MariaDB 10.2 and later support window functions, which are designed for this kind of operation:
SELECT *
FROM (SELECT *,ROW_NUMBER() OVER(PARTITION BY column2 ORDER BY column1 DESC) AS r
FROM tab) s
WHERE r <= 2
ORDER BY column2 DESC, r DESC;
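The window-function query can be tried on the sample data from the question. This sketch keeps the table name tab from the query above, wraps it in a hypothetical helper top_n with n as a bound parameter, and runs on SQLite 3.25+ through Python's sqlite3 rather than on MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (column1 INTEGER, column2 TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, "foo"), (2, "foo"), (3, "foo"), (4, "foo"),
     (5, "bar"), (6, "bar"), (7, "bar"), (8, "bar")],
)

def top_n(conn, n):
    """Return the n largest column1 values per column2 group, ascending."""
    query = """
    SELECT column1
    FROM (
        SELECT column1,
               ROW_NUMBER() OVER (
                   PARTITION BY column2
                   ORDER BY column1 DESC
               ) AS r
        FROM tab
    ) s
    WHERE r <= :n
    ORDER BY column1
    """
    return [row[0] for row in conn.execute(query, {"n": n})]

print(top_n(conn, 2))  # [3, 4, 7, 8]
```

ROW_NUMBER always returns exactly n rows per group when enough rows exist; if tied values should all be kept, RANK() or DENSE_RANK() could be used instead.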
This is how I'm getting the N max rows per group in MySQL
SELECT co.id, co.person, co.country
FROM person co
WHERE (
SELECT COUNT(*)
FROM person ci
WHERE co.country = ci.country AND co.id < ci.id
) < 1
;
How it works:
the subquery is a correlated self join on the same table
groups are defined by co.country = ci.country
the number of elements per group is controlled by ) < 1, so for 3 elements use ) < 3
whether you get the max or the min depends on the comparison co.id < ci.id:
co.id < ci.id - max
co.id > ci.id - min
Full example here:
mysql select n max values per group/
mysql select max and return multiple values
Note: Keep in mind that additional constraints like gender = 0 must be applied in both places. So if you want to get males only, apply the constraint in both the inner and the outer select.
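The counting approach can be sketched for N = 2 as follows. The rows in person are made-up sample data, and SQLite via Python's sqlite3 stands in for MySQL; the query text itself uses no vendor-specific syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, person TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Ann", "US"), (2, "Bob", "US"), (3, "Cid", "US"),
     (4, "Dee", "FR"), (5, "Eve", "FR"), (6, "Fay", "DE")],  # hypothetical rows
)

query = """
SELECT co.id
FROM person co
WHERE (
    SELECT COUNT(*)              -- how many rows in the same country have a larger id
    FROM person ci
    WHERE co.country = ci.country AND co.id < ci.id
) < :n                           -- keep a row if fewer than n rows beat it
ORDER BY co.country, co.id DESC
"""
top_ids = [row[0] for row in conn.execute(query, {"n": 2})]
print(top_ids)  # [6, 5, 4, 3, 2]
```

A row survives when fewer than n rows of its group rank above it, which is why the comparison direction decides between max and min.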

Exclude top and bottom n rows in SQL

I'm trying to query a database but excluding the first and last rows from the table. Here's a sample table:
id | val
--------
1 1
2 9
3 3
4 1
5 2
6 6
7 4
In the above example, I'd first like to order it by val and then exclude the first and last rows for the query.
id | val
--------
4 1
5 2
3 3
7 4
6 6
This is the resulting set I would like. Note row 1 and 2 were excluded as they had the lowest and highest val respectively.
I've considered LIMIT, TOP, and a couple of other things but can't get my desired result. If there's a method to do it (even better with first/last % rather than first/last n), I can't figure it out.
You can try this mate:
SELECT * FROM numbers
WHERE id NOT IN (
SELECT id FROM numbers
WHERE val IN (
SELECT MAX(val) FROM numbers
) OR val IN (
SELECT MIN(val) FROM numbers
)
);
You can try this:
Select *
from table
where
val!=(select val from table order by val asc LIMIT 1)
and
val!=(select val from table order by val desc LIMIT 1)
order by val asc;
You can also use UNION and avoid the two val != (subquery) comparisons.
;WITH cte (id, val, rnum, qty) AS (
SELECT id
, val
, ROW_NUMBER() OVER(ORDER BY val, id)
, COUNT(*) OVER ()
FROM t
)
SELECT id
, val
FROM cte
WHERE rnum BETWEEN 2 AND qty - 1
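The question also asks about cutting a percentage instead of a fixed number of rows. One possible extension of the ROW_NUMBER/COUNT query above is sketched here; the helper name trimmed and its fraction parameter are hypothetical, the table name t is taken from the query above, and SQLite 3.25+ via Python's sqlite3 is assumed as the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (2, 9), (3, 3), (4, 1), (5, 2), (6, 6), (7, 4)],
)

def trimmed(conn, fraction):
    """Return rows ordered by val with `fraction` of the rows cut from each end."""
    query = """
    WITH cte AS (
        SELECT id, val,
               ROW_NUMBER() OVER (ORDER BY val, id) AS rnum,
               COUNT(*) OVER () AS qty
        FROM t
    )
    SELECT id, val
    FROM cte
    WHERE rnum > CAST(qty * :f AS INTEGER)          -- skip the lowest rows
      AND rnum <= qty - CAST(qty * :f AS INTEGER)   -- skip the highest rows
    ORDER BY rnum
    """
    return conn.execute(query, {"f": fraction}).fetchall()

middle = trimmed(conn, 0.2)  # 20% of 7 rows truncates to 1 row per end
print(middle)  # [(4, 1), (5, 2), (3, 3), (7, 4), (6, 6)]
```

Ordering by val, id makes the choice between tied values deterministic: of the two rows with val = 1, the one with the lower id is the one that gets cut.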
What if you use UNION and exclude the vals you don't want? Something like the following:
select * from your_table
where val not in (
select top 1 val from your_table order by val
union
select top 1 val from your_table order by val desc)
