Identifying groups in Group By - mysql

I am running a complicated group by statement and I get all my results in their respective groups. But I want to create a custom column with their "group id". Essentially all the items that are grouped together would share an ID.
This is what I get:
partID | Description
-------+------------
11000 | "Oven"
12000 | "Oven"
13000 | "Stove"
13020 | "Stove"
12012 | "Grill"
This is what I want:
partID | Description | GroupID
-------+-------------+----------
11000 | "Oven" | 1
12000 | "Oven" | 1
13000 | "Stove" | 2
13020 | "Stove" | 2
12012 | "Grill" | 3
"GroupID" does not exist as data in any of the tables; it would be a custom generated column (alias) associated with that group's key, ID, index, or whatever it would be called.
How would I go about doing this?

I think this is the query that returns the five rows:
select partId, Description
from part p;
Here is one way (using standard SQL) to get the groups:
select partId, Description,
(select count(distinct Description)
from part p2
where p2.Description <= p.Description
) as GroupId
from part p;
This uses a correlated subquery. The subquery finds all the description values less than or equal to the current one and counts the distinct values. Note that this gives a different set of values from the ones in the OP: they are assigned alphabetically rather than by first encounter in the data. If that is important, the OP should add it to the question. Based on the question, the particular ordering did not seem important.
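To see what the correlated subquery produces, here is a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example runs without a server); the rows are the five from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (partId INTEGER PRIMARY KEY, Description TEXT);
    INSERT INTO part VALUES
        (11000, 'Oven'), (12000, 'Oven'),
        (13000, 'Stove'), (13020, 'Stove'),
        (12012, 'Grill');
""")

# For each row, count the distinct descriptions that sort at or before it.
group_rows = conn.execute("""
    SELECT partId, Description,
           (SELECT COUNT(DISTINCT Description)
            FROM part p2
            WHERE p2.Description <= p.Description) AS GroupId
    FROM part p
    ORDER BY partId
""").fetchall()

for row in group_rows:
    print(row)
```

With this data Grill gets 1, Oven 2, and Stove 3, which shows the alphabetical assignment rather than the first-encounter numbering in the question.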

Here's one way to get it:
SELECT p.partID, p.Description, b.groupID
FROM (
    SELECT Description, @rn := @rn + 1 AS groupID
    FROM (
        SELECT distinct description
        FROM part, (SELECT @rn := 0) c
    ) a
) b
INNER JOIN part p ON p.description = b.description;
sqlfiddle demo
This assigns a different groupID to each description, and then joins the original table on that description.

Based on your comments in response to Gordon's answer, I think what you need is a derived table to generate your groupids, like so:
select
    t1.description,
    @cntr := @cntr + 1 as GroupID
FROM
    (select distinct table1.description from table1) t1
    cross join
    (select @cntr := 0) t2
which will give you:
DESCRIPTION GROUPID
Oven 1
Stove 2
Grill 3
Then you can use that in your original query, joining on description:
select
    t1.partid,
    t1.description,
    t2.GroupID
from
    table1 t1
    inner join
    (
        select
            t1.description,
            @cntr := @cntr + 1 as GroupID
        FROM
            (select distinct table1.description from table1) t1
            cross join
            (select @cntr := 0) t2
    ) t2
    on t1.description = t2.description
SQL Fiddle

SELECT partID , Description, @s:=@s+1 GroupID
FROM part, (SELECT @s:= 0) AS s
GROUP BY Description

Related

Calculate percentage in mySQL where SUM is already present in the table

I have a table (which I have no control over) like this:
As you can see, this already has the total calculated in a separate row.
I have to calculate a percentage, which should look something like this:
The issue is how do I pass the Total in a subquery like
SELECT Marks from <TABLE> WHERE Topic = 'Total';
so that I only get a single row?
Thanks

You can do something along the lines of
SELECT m1.*, ROUND(m1.marks / m2.marks * 100, 2) percentage
FROM marks m1 join marks m2
ON m1.name = m2.name AND m2.topic = 'Total'
ORDER BY name, topic
Output:
| Name | Topic | Marks | percentage |
|------|---------|-------|------------|
| Joe | Chem | 43 | 26.38 |
| Joe | Maths | 75 | 46.01 |
| Joe | Physics | 45 | 27.61 |
| Joe | Total | 163 | 100 |
...
SQLFiddle

The total SHOULD NOT be in the table. Given that you cannot modify it, I would just ignore that value and calculate the total and then calculate the percentage.
SELECT
m.Name,
Topic,
Marks,
Marks / t.Total * 100 AS Percentage
FROM
marks AS m
JOIN (
SELECT
Name,
SUM(Marks) AS Total
FROM
marks
WHERE
Topic != 'Total'
GROUP BY
Name) AS t ON t.Name = m.Name

In a subquery select the row with the same name and the topic 'Total'.
SELECT t1.name,
t1.topic,
t1.marks,
t1.marks
/ (SELECT t2.marks
FROM elbat t2
WHERE t2.name = t1.name
AND t2.topic = 'Total')
* 100 percentage
FROM elbat t1;
Another option is using a join.
SELECT t1.name,
t1.topic,
t1.marks,
t1.marks
/ t2.marks
* 100 percentage
FROM elbat t1
LEFT JOIN elbat t2
ON t2.name = t1.name
AND t2.topic = 'Total';
name is required to be unique and there must only be one row with 'Total' per name. Otherwise the subquery will throw an error about returning more than one row. With the join there's no such error but nonsense/ambiguous results.
You might also think about the case when there's a total of 0, as this would trigger a division by zero error.
The table design, alas, is bad. Tables represent relations, not spreadsheets. The rows with the total have no business being in there. Look up relational normalization.
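The join against the 'Total' row, with a guard for a zero total, can be sketched as follows. This is only an illustration: it runs on Python's built-in sqlite3 module rather than MySQL, the table is called marks as in the first answer, and the 'Ann' rows are hypothetical data added to exercise the zero case. Whether an unguarded zero divisor yields NULL or an error depends on the engine and its SQL mode; NULLIF makes the NULL result explicit either way.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (name TEXT, topic TEXT, marks INTEGER);
    INSERT INTO marks VALUES
        ('Joe', 'Chem', 43), ('Joe', 'Maths', 75),
        ('Joe', 'Physics', 45), ('Joe', 'Total', 163),
        ('Ann', 'Chem', 0), ('Ann', 'Total', 0);  -- hypothetical zero total
""")

# NULLIF turns a zero total into NULL, so the percentage becomes NULL
# instead of depending on how the engine treats division by zero.
# 100.0 forces floating-point division (SQLite divides integers as integers).
percent_rows = conn.execute("""
    SELECT m1.name, m1.topic,
           ROUND(100.0 * m1.marks / NULLIF(m2.marks, 0), 2) AS percentage
    FROM marks m1
    JOIN marks m2 ON m2.name = m1.name AND m2.topic = 'Total'
    WHERE m1.topic <> 'Total'
    ORDER BY m1.name, m1.topic
""").fetchall()

for row in percent_rows:
    print(row)
```

Joe's rows come out as 26.38, 46.01, and 27.61, matching the output shown earlier, while Ann's percentage is NULL (None in Python).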

mysql small count issue on same table

Please find db structure as following...
| id | account_number | referred_by |
+----+-----------------+--------------+
| 1 | ac203003 | ac203005 |
+----+-----------------+--------------+
| 2 | ac203004 | ac203005 |
+----+-----------------+--------------+
| 3 | ac203005 | ac203004 |
+----+-----------------+--------------+
I want to achieve following results...
id, account_number, total_referred
1, ac203005, 2
2, ac203003, 0
3, ac203004, 1
And I am using the following query...
SELECT id, account_number,
(SELECT count(*) FROM `member_tbl` WHERE referred_by = account_number) AS total_referred
FROM `member_tbl`
GROUP BY id, account_number
but it's not giving the expected results. Please help, thanks.

You need to use table aliases to do this correctly:
SELECT id, account_number,
(SELECT count(*)
FROM `member_tbl` t2
WHERE t2.referred_by = t1.account_number
) AS total_referred
FROM `member_tbl` t1;
Your original query had referred_by = account_number. Without aliases, these would come from the same row -- and the value would be 0.
Also, I removed the outer group by. It doesn't seem necessary, unless you want to remove duplicates.
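The effect of the aliases can be checked with a short sketch. It uses Python's built-in sqlite3 module instead of MySQL (an assumption, so that it runs without a server) and the three rows from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member_tbl (
        id INTEGER PRIMARY KEY, account_number TEXT, referred_by TEXT);
    INSERT INTO member_tbl VALUES
        (1, 'ac203003', 'ac203005'),
        (2, 'ac203004', 'ac203005'),
        (3, 'ac203005', 'ac203004');
""")

# Without aliases, both column names resolve to the inner member_tbl,
# so the subquery compares each row with itself.
unaliased = conn.execute("""
    SELECT account_number,
           (SELECT COUNT(*) FROM member_tbl
            WHERE referred_by = account_number) AS total_referred
    FROM member_tbl
    ORDER BY account_number
""").fetchall()

# With aliases, the subquery is correlated to the outer row.
aliased = conn.execute("""
    SELECT t1.account_number,
           (SELECT COUNT(*) FROM member_tbl t2
            WHERE t2.referred_by = t1.account_number) AS total_referred
    FROM member_tbl t1
    ORDER BY t1.account_number
""").fetchall()

print(unaliased)
print(aliased)
```

The unaliased version reports 0 for every account, while the aliased one gives 0, 1, and 2 for ac203003, ac203004, and ac203005.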

One idea is to join the table on itself. This way you can avoid the subquery. There might be performance gains with this approach.
select b.id, b.account_number, count(a.referred_by) as total_referred
from member_tbl b left join member_tbl a
on a.referred_by = b.account_number
group by b.id, b.account_number;
SQL fiddle: http://sqlfiddle.com/#!2/b1393/2
Another test, with more data: http://sqlfiddle.com/#!2/8d216/1

select t1.account_number, count(t2.referred_by)
from (select account_number from member_tbl) t1
left join member_tbl t2 on
t1.account_number = t2.referred_by
group by t1.account_number;
Fiddle for your data
Fiddle with more data

What to do with Full Outer Join

I need a Full Outer Join in MySQL. I found a solution here: Full Outer Join in MySQL. My problem is that t1 and t2 are subqueries themselves, so the resulting query looks like a monster.
What to do in this situation? Should I use views instead of subqueries?
Edit:
I'll try to explain a bit more. I have orders and payments. One payment can cover multiple orders, and one order can be covered by multiple payments. That is why I have the tables orders, payments, and paymentitems. Each order has the fields company (which made the order) and manager (who accepted the order). Now I need to group orders and payments by company and manager and total the money. So I want to get something like this:
company1 | managerA | 200 | 200 | 0
company1 | managerB | Null | 100 | 100
company1 | managerC | 300 | Null | -300
company2 | managerA | 150 | Null | -150
company2 | managerB | 100 | 350 | 250
The query, I managed to create:
SELECT coalesce(o.o_company, p.o_company)
, coalesce(o.o_manager, p.o_manager)
, o.orderstotal
, p.paymentstotal
, (coalesce(p.paymentstotal, 0) - coalesce(o.orderstotal, 0)) AS balance
FROM
(((/*Subquery A*/SELECT orders.o_company
, orders.o_manager
, sum(o_money) AS orderstotal
FROM
orders
WHERE
(o_date >= @startdate)
AND (o_date <= @enddate)
GROUP BY
o_company
, o_manager) AS o
LEFT JOIN (/*Subquery B*/SELECT orders.o_company
, orders.o_manager
, sum(paymentitems.p_money) AS paymentstotal
FROM
((payments
INNER JOIN paymentitems
ON payments.p_id = paymentitems.p_id)
INNER JOIN orders
ON paymentitems.p_oid = orders.o_id)
WHERE
(payments.p_date >= @startdate)
AND (payments.p_date <= @enddate)
GROUP BY
orders.o_company
, orders.o_manager) AS p
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager))
union
(/*Subquery A*/
right join /*Subquery B*/
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager)))
This is a simplified version of my query. The real query is much more complex, which is why I want to keep it as simple as possible. Maybe I could even split it into views, or maybe there are other options I am not aware of.

I think the clue is in "group orders and payments by company". Break the outer join into a query on orders and another query on payments, then add up the type of money (orders or payments) for each company.

If you are trying to do a full outer join and the relationship is 1-1, then you can accomplish the same thing with a union and aggregation.
Here is an example, pulling one column from each of two different tables:
select id, max(col1) as col1, max(col2) as col2
from ((select t1.id, t1.col1, NULL as col2
       from t1
      ) union all
      (select t2.id, NULL as col1, t2.col2
       from t2
      )
     ) t
group by id
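Applied to the orders and payments case, the same union-and-aggregate pattern might look like the sketch below. The tables order_totals and payment_totals are hypothetical stand-ins for the two aggregated subqueries in the question, the figures are the company1 rows from the desired output, and Python's built-in sqlite3 module is used in place of MySQL so the example runs on its own.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical pre-aggregated inputs (one row per company/manager).
    CREATE TABLE order_totals (company TEXT, manager TEXT, orderstotal INTEGER);
    CREATE TABLE payment_totals (company TEXT, manager TEXT, paymentstotal INTEGER);
    INSERT INTO order_totals VALUES
        ('company1', 'managerA', 200), ('company1', 'managerC', 300);
    INSERT INTO payment_totals VALUES
        ('company1', 'managerA', 200), ('company1', 'managerB', 100);
""")

# Stack both sides with NULL placeholders, then collapse each key to one
# row; MAX ignores NULLs, so a missing side stays NULL.
balance_rows = conn.execute("""
    SELECT company, manager,
           MAX(orderstotal) AS orderstotal,
           MAX(paymentstotal) AS paymentstotal,
           COALESCE(MAX(paymentstotal), 0)
             - COALESCE(MAX(orderstotal), 0) AS balance
    FROM (
        SELECT company, manager, orderstotal, NULL AS paymentstotal
        FROM order_totals
        UNION ALL
        SELECT company, manager, NULL AS orderstotal, paymentstotal
        FROM payment_totals
    ) t
    GROUP BY company, manager
    ORDER BY company, manager
""").fetchall()

for row in balance_rows:
    print(row)
```

Each (company, manager) pair appears once, whether it exists on one side or both, which is the behaviour a full outer join would give for a 1-1 relationship.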

MySQL: Group by date proximity?

I wrote this query, it does almost what I want:
SELECT * FROM
(
SELECT COUNT(*) as cnt,
lat,
lon,
elev,
GROUP_CONCAT(CONCAT(usaf,'-',wban))
FROM `ISH-HISTORY_HASPOS`
GROUP BY lat,lon,elev
) AS x WHERE cnt >=1;
output:
+-----+--------+----------+--------+-------------------------------------------------+
| cnt | lat | lon | elev | GROUP_CONCAT(CONCAT(usaf,'-',wban)) |
+-----+--------+----------+--------+-------------------------------------------------+
| 4 | 30.478 | -87.187 | 36 | 722220-13899,722221-13899,722223-13899,999999-13899 |
| 4 | 36.134 | -80.222 | 295.7 | 723190-93807,723191-93807,723193-93807,999999-93807 |
| 5 | 37.087 | -84.077 | 369.1 | 723290-03849,723291-03849,723293-03849,724243-03849,999999-03849 |
| 5 | 38.417 | -113.017 | 1534.1 | 745200-23176,745201-23176,999999-23176,724757-23176,724797-23176 |
| 4 | 40.217 | -76.851 | 105.8 | 999999-14751,725110-14751,725111-14751,725118-14751 |
+-----+--------+----------+--------+-------------------------------------------------+
This returns a concatenated list of stations that are located at identical coordinates. However, I am only interested in concatenating stations with adjoining date ranges. The table that I select from (ISH-HISTORY_HASPOS) has two datetime columns : 'begin' and 'end'. I need the values for these two columns to be within 3 days of each other to satisfy the GROUP_CONCAT conditions.
Edit: In order for a station to be included in the final result's GROUP_CONCAT it must satisfy the following conditions:
1. It must be co-located with another station in the list (group by lat, lon, elev).
2. Its end time must be within 3 days of another station's begin time OR its begin time must be within 3 days of another station's end time. When I say "another station", I am referring to stations that are co-located (meet the conditions for #1).
I figure that I will have to use a subquery but I can't seem to figure out how to do it. Some help would be greatly appreciated! Either a query or a stored procedure would be great but a php solution would also be acceptable.
Here is a dump of the table that I am querying: sql dump
The results should look the same as my example, but non-adjoining items (date-wise) should not be there.

A solution could be to use a subquery that computes the list of stations within 3 days of each other, and to add this subquery as a where clause to the main query.
The subquery consists of a cartesian product that lists all possible station pairs, with a first condition to keep just the first half of the resulting matrix and two conditions to specify the time constraints. As to these latter conditions, I just guessed them; I don't really know the unit of measure of the begin and end fields.
The resulting query could be this:
SELECT * FROM (
    SELECT COUNT(*) AS cnt,
           lat,
           lon,
           elev,
           GROUP_CONCAT(CONCAT(usaf, '-', wban))
    FROM `ISH-HISTORY_HASPOS`
    WHERE id IN (
        -- 259200 s = 3 days; assumes begin/end are DATETIME columns
        -- first station of each co-located pair with adjoining dates
        SELECT DISTINCT t1.id
        FROM `ISH-HISTORY_HASPOS` t1
        INNER JOIN `ISH-HISTORY_HASPOS` t2
            ON t1.lon = t2.lon
           AND t1.lat = t2.lat
           AND t1.elev = t2.elev
        WHERE t1.id < t2.id
          AND (ABS(TIMESTAMPDIFF(SECOND, t2.`end`, t1.`begin`)) < 259200
            OR ABS(TIMESTAMPDIFF(SECOND, t2.`begin`, t1.`end`)) < 259200)
        UNION
        -- second station of each such pair
        SELECT DISTINCT t2.id
        FROM `ISH-HISTORY_HASPOS` t1
        INNER JOIN `ISH-HISTORY_HASPOS` t2
            ON t1.lon = t2.lon
           AND t1.lat = t2.lat
           AND t1.elev = t2.elev
        WHERE t1.id < t2.id
          AND (ABS(TIMESTAMPDIFF(SECOND, t2.`end`, t1.`begin`)) < 259200
            OR ABS(TIMESTAMPDIFF(SECOND, t2.`begin`, t1.`end`)) < 259200)
    )
    GROUP BY lat, lon, elev
) AS x WHERE cnt >= 1;
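The pairing condition on its own can be tried out with a small sketch. Everything here is a simplification for illustration: the table station, its columns, and the four rows are hypothetical, a single site column replaces lat/lon/elev, and Python's built-in sqlite3 module is used instead of MySQL. SQLite's julianday converts a date to a day count; in MySQL a function such as TIMESTAMPDIFF or DATEDIFF would play that role.

```python
import sqlite3

# In-memory SQLite database with hypothetical station data (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE station (
        id INTEGER PRIMARY KEY, site TEXT, begin_dt TEXT, end_dt TEXT);
    INSERT INTO station VALUES
        (1, 'A', '2000-01-01', '2000-06-30'),
        (2, 'A', '2000-07-02', '2000-12-31'),  -- starts 2 days after 1 ends
        (3, 'A', '2005-01-01', '2005-12-31'),  -- co-located, years apart
        (4, 'B', '2000-01-01', '2000-06-30');  -- no co-located partner
""")

# Keep a station if some other station at the same site begins within
# 3 days of its end, or ends within 3 days of its begin.
rows = conn.execute("""
    SELECT DISTINCT s1.id
    FROM station s1
    JOIN station s2 ON s2.site = s1.site AND s2.id <> s1.id
    WHERE ABS(julianday(s1.end_dt) - julianday(s2.begin_dt)) <= 3
       OR ABS(julianday(s1.begin_dt) - julianday(s2.end_dt)) <= 3
    ORDER BY s1.id
""").fetchall()
adjacent_ids = [r[0] for r in rows]
print(adjacent_ids)
```

Only stations 1 and 2 qualify; the resulting IDs could then feed the WHERE id IN (...) filter of the grouped query.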

I only have access and knowledge of SQL Server so I can't get your data to work and I don't know if MySQL has the equivalent functionality but here is a verbal description of what you need to do.
You need a recursive statement (WITH CTE in SQL Server) to join the table to itself on lat, lon, elev and begin BETWEEN end -3 AND end +3. You will need to be careful not to get caught in an infinite loop - I suggest building a comma-separated list of the IDs you have visited and checking this as you go. It's painful, but keep this list in ID order because it is what you will need to group on at the end. You also need to keep track of your depth and the original id.
Something like ...
WITH cte(id, idlist, lat, lon, elev, starts, ends, depth)
AS (
    SELECT id, CAST(id AS varchar), lat, lon, elev, starts, ends, 0
    FROM `ISH-HISTORY_HASPOS`
    UNION ALL
    SELECT i.id, FunctionToManagetheList(cte.idlist, i.id), i.lat, i.lon, i.elev, i.starts, i.ends, cte.depth + 1
    FROM `ISH-HISTORY_HASPOS` i
    INNER JOIN
    cte ON i.lat=cte.lat AND
    i.lon=cte.lon AND
    i.elev=cte.elev AND
    NOT FunctionToCheckIfTheIDisintheLitst(i.id, cte.idlist)
)
SELECT stuffyouneed
FROM `ISH-HISTORY_HASPOS` i
INNER JOIN
(SELECT id, MAX(depth) AS MaxDepth
FROM cte
GROUP BY id) cte1 ON i.id=cte1.id
INNER JOIN
cte cte2 ON cte1.id=cte2.id AND cte1.MaxDepth=cte2.depth
GROUP BY cte2.idlist

MySQL getting the lowest ID for a certain user -or- the ID of the entry with the highest urgency for each row

I have the following database
id | user | urgency | problem | solved
The information in there has different users, but these users all have multiple entries
1 | marco | 0 | MySQL problem | n
2 | marco | 0 | Email problem | n
3 | eddy | 0 | Email problem | n
4 | eddy | 1 | MTV doesn't work | n
5 | frank | 0 | out of coffee | y
What I want to do is this: Normally I would check everybody's oldest problem first. I use this query to get the IDs of the oldest problems.
select min(id) from db group by user
This gives me a list of the oldest problem IDs. But I want people to be able to make a certain problem more urgent. So for each user I want the lowest ID, or, if one of their problems has a higher urgency, the ID of the problem with the highest urgency.
Getting the max(urgency) won't give the ID of the problem, it will give me the max urgency.
To be clear: I want to get this as a result
row | id
0 | 1
1 | 4
The last entry should not be in the results since it's solved.

Select ...
From SomeTable As T
Join (
Select T1.User, Min( T1.Id ) As Id
From SomeTable As T1
Join (
Select T2.User, Max( T2.Urgency ) As Urgency
From SomeTable As T2
Where T2.Solved = 'n'
Group By T2.User
) As MaxUrgency
On MaxUrgency.User = T1.User
And MaxUrgency.Urgency = T1.Urgency
Where T1.Solved = 'n'
Group By T1.User
) As Z
On Z.User = T.User
And Z.Id = T.Id
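The core of this nested query (highest urgency per user among unsolved problems, then the lowest ID at that urgency) can be checked with a short sketch. It assumes a table named tickets, a hypothetical name, holding the five rows from the question, and runs on Python's built-in sqlite3 module rather than MySQL.

```python
import sqlite3

# In-memory SQLite database; "tickets" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (
        id INTEGER PRIMARY KEY, user TEXT, urgency INTEGER,
        problem TEXT, solved TEXT);
    INSERT INTO tickets VALUES
        (1, 'marco', 0, 'MySQL problem', 'n'),
        (2, 'marco', 0, 'Email problem', 'n'),
        (3, 'eddy', 0, 'Email problem', 'n'),
        (4, 'eddy', 1, 'MTV doesn''t work', 'n'),
        (5, 'frank', 0, 'out of coffee', 'y');
""")

# Inner query: highest urgency per user among unsolved problems.
# Outer query: lowest id per user at that urgency.
picked = conn.execute("""
    SELECT t.user, MIN(t.id) AS first_id
    FROM tickets t
    JOIN (SELECT user, MAX(urgency) AS urgency
          FROM tickets
          WHERE solved = 'n'
          GROUP BY user) m
      ON m.user = t.user AND m.urgency = t.urgency
    WHERE t.solved = 'n'
    GROUP BY t.user
    ORDER BY first_id
""").fetchall()

print(picked)
```

This yields ID 1 for marco and ID 4 for eddy, and leaves out frank, whose only problem is already solved.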

There are lots of esoteric ways to do this, but here's one of the clearer ones.
First build a query to get your min id and max urgency:
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
Then incorporate that as a logical table into a larger query for your answers:
SELECT
user,
min_id,
max_urgency,
( SELECT MIN(id) FROM db
WHERE user = a.user
AND urgency = a.max_urgency
) AS max_urgency_min_id
FROM
(
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
) AS a
Given the obvious indexes, this should be pretty efficient.

The following will get you exactly one row back -- the most urgent, probably oldest problem in your table.
select id from my_table where id = (
select min(id) from my_table where urgency = (
select max(urgency) from my_table
)
)
I was about to suggest adding a create_date column to your table so that you could get the oldest problem first for those problems of the same urgency level. But I'm now assuming you're using the lowest ID for that purpose.
But now I see you wanted a list of them. For that, you'd sort the results by ID:
select id from my_table where urgency = (
select max(urgency) from my_table
) order by id;
[Edit: Left out the order by!]
I forget, honestly, how to get the row number. Someone on the interwebs suggests something like this, but no idea if it works:
select @rownum:=@rownum+1 'row', id from my_table where ...
