MySQL - Table Query Inner Joining to itself

Consider the above query result,
Is there a way I can join the table itself to get the following results:-
POH_ID | JOH_ID | .............
-------------------------------------------
NULL | JOH_00000002 | .............
POH_00000002 | JOH_00000001 | .............
POH_00000001 | JOH_00000001 | .............
In other words, if a JOH_ID appears only once, I want that row; if the same JOH_ID appears more than once, I want only the rows that have a POH_ID.
The result in the photo is a result of a query

You could count the rows for each joh_id and join that count back to the main table, keeping only the rows whose joh_id occurs once or whose poh_id is not null:
select t.*
from your_table t
join (
select joh_id, count(*) as cnt
from your_table
group by joh_id
) t2 on t.joh_id = t2.joh_id
where t2.cnt = 1 or t.poh_id is not null;
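A minimal sketch of the filter above, run against an in-memory SQLite database through Python's sqlite3 module instead of MySQL (an assumption made only to keep the example self-contained). The table name your_table comes from the answer; the sample rows are made up to mirror the desired output in the question, plus one hypothetical NULL row for JOH_00000001 that should be filtered out.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for a runnable demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (poh_id TEXT, joh_id TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        (None, "JOH_00000002"),            # only row for this joh_id: kept
        (None, "JOH_00000001"),            # hypothetical row: dropped
        ("POH_00000002", "JOH_00000001"),  # kept, has a poh_id
        ("POH_00000001", "JOH_00000001"),  # kept, has a poh_id
    ],
)

rows = conn.execute(
    """
    SELECT t.poh_id, t.joh_id
    FROM your_table t
    JOIN (
        SELECT joh_id, COUNT(*) AS cnt
        FROM your_table
        GROUP BY joh_id
    ) t2 ON t.joh_id = t2.joh_id
    WHERE t2.cnt = 1 OR t.poh_id IS NOT NULL
    ORDER BY t.joh_id DESC, t.poh_id DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY is added only to make the output order deterministic; it is not part of the original answer.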

Getting data from multiple tables and applying an arithmetic operation on the result

I want to fetch data from two tables and apply an arithmetic operation on the columns.
This is what I tried:
String sql = "SELECT SUM(S.san_recover - C.amount) AS total "
+ "FROM sanction S "
+ "LEFT JOIN collection C ON S.client_id = C.client_id "
+ "WHERE S.client_id = ?";
This code works only when both tables have values for the client; if one of the two tables has no value, I get no result.
SELECT SUM(S.san_recover - C.amount) as total
FROM sanction S
LEFT JOIN collection C ON S.client_id = C.client_id
WHERE S.client_id = ?
The problem lies in the expression inside SUM(). When the left join does not bring back any matching records, C.amount is NULL. Subtracting NULL from something yields NULL, so every term of the sum is NULL and you end up with a NULL result for the SUM().
You probably want COALESCE(), like so:
SELECT SUM(S.san_recover - COALESCE(C.amount, 0)) as total
FROM sanction S
LEFT JOIN collection C ON S.client_id = C.client_id
WHERE S.client_id = ?
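To see the NULL propagation concretely, here is a small sketch that runs both versions of the query through Python's sqlite3 module. SQLite is used in place of MySQL as an assumption for a self-contained demo, and the sample amounts are invented; client 2 deliberately has no row in collection.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sanction (client_id INTEGER, san_recover INTEGER);
    CREATE TABLE collection (client_id INTEGER, amount INTEGER);
    INSERT INTO sanction VALUES (1, 100), (2, 50);
    INSERT INTO collection VALUES (1, 30);  -- nothing for client 2
    """
)

PLAIN = """
    SELECT SUM(S.san_recover - C.amount) AS total
    FROM sanction S
    LEFT JOIN collection C ON S.client_id = C.client_id
    WHERE S.client_id = ?
"""

FIXED = """
    SELECT SUM(S.san_recover - COALESCE(C.amount, 0)) AS total
    FROM sanction S
    LEFT JOIN collection C ON S.client_id = C.client_id
    WHERE S.client_id = ?
"""


def total(sql, client_id):
    """Run one of the queries for a client and return the single value."""
    return conn.execute(sql, (client_id,)).fetchone()[0]


print(total(PLAIN, 1))  # both tables have rows: 100 - 30
print(total(PLAIN, 2))  # no collection row: the sum is NULL (None)
print(total(FIXED, 2))  # COALESCE turns the missing amount into 0
```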
If a client may exist in one table but not the other, a full join would be appropriate. Since MySQL does not support FULL JOIN, a UNION ALL in a subquery will do:
drop table if exists sanctions,collections;
create table sanctions(client_id int, amount int);
create table collections(client_id int, amount int);
insert into sanctions values
(1,10),(1,10),(2,10);
insert into collections values
(1,5),(3,10);
Select sum(Samount - camount)
From
(Select sum(amount) Samount, 0 as camount from sanctions where client_id =3
Union all
Select 0,sum(amount) as camount from collections where client_id =3
) s
;
+------------------------+
| sum(Samount - camount) |
+------------------------+
| -10 |
+------------------------+
1 row in set (0.00 sec)
If you want to do this for all clients:
Select client_id,sum(Samount - camount) net
From
(Select client_id,sum(amount) Samount, 0 as camount from sanctions group by client_id
Union all
Select client_id,0,sum(amount) as camount from collections group by client_id
) s
group by client_id
;
+-----------+------+
| client_id | net |
+-----------+------+
| 1 | 15 |
| 2 | 10 |
| 3 | -10 |
+-----------+------+
3 rows in set (0.00 sec)

SQL writing custom query

I need to write a SQL Query which generates the name of the most popular story for each user (according to total reading counts). Here is some sample data:
story_name | user | age | reading_counts
-----------|-------|-----|---------------
story 1 | user1 | 4 | 12
story 2 | user2 | 6 | 14
story 4 | user1 | 4 | 15
This is what I have so far but I don't think it's correct:
Select *
From mytable
where (story_name,reading_counts)
IN (Select id, Max(reading_counts)
FROM mytable
Group BY user
)
In a derived table, you can first determine the maximum reading_counts for every user (GROUP BY with MAX()).
Then join this result set to the main table on user and reading_counts to get the row corresponding to the maximum reading_counts for each user.
Try the following query:
SELECT
t1.*
FROM mytable AS t1
JOIN
(
SELECT t2.user,
MAX(t2.reading_counts) AS max_count
FROM mytable AS t2
GROUP BY t2.user
) AS dt
ON dt.user = t1.user AND
dt.max_count = t1.reading_counts
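As a quick check of the derived-table join, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module (SQLite rather than MySQL is an assumption for the sake of a runnable example) and runs the same query. The ORDER BY is added only for a stable output order.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for a runnable demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(story_name TEXT, user TEXT, age INTEGER, reading_counts INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("story 1", "user1", 4, 12),
        ("story 2", "user2", 6, 14),
        ("story 4", "user1", 4, 15),
    ],
)

top_stories = conn.execute(
    """
    SELECT t1.*
    FROM mytable AS t1
    JOIN (
        SELECT t2.user, MAX(t2.reading_counts) AS max_count
        FROM mytable AS t2
        GROUP BY t2.user
    ) AS dt
      ON dt.user = t1.user
     AND dt.max_count = t1.reading_counts
    ORDER BY t1.user
    """
).fetchall()

for row in top_stories:
    print(row)
```

Note that if two stories tie for a user's maximum, this join returns both rows.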
SELECT *
FROM mytable
WHERE (user, reading_counts) IN
(SELECT user, MAX(reading_counts)
FROM mytable
GROUP BY user)

MySQL query to select rows matching criteria and rows related to the matching rows

I would like to select all rows where a column matches a criterion, but also select all rows that don't match it but are related to the rows that do.
Given a table structure like this:
group_id | word
---------+------
1 | the
2 | cat
2 | sat
3 | on
1 | the
3 | mat
Given the criteria WHERE word LIKE '%at%', I'd want to get the matching rows
2 | cat
2 | sat
3 | mat
but I also want to get the related rows. That is, rows with a group_id equalling the group_id of any row matching the criteria, which in this case would be group_id 2 or 3. The final result should be:
2 | cat
2 | sat
3 | on
3 | mat
I think that a self join is the way to go, but I can't quite figure it out.
One method uses in:
select t.*
from t
where t.group_id in (select t2.group_id
from t t2
where t2.word LIKE '%at%'
);
If you try to do the same thing using join, you might get duplicate results.
I think I've figured out how to do it with a self join.
SELECT DISTINCT `t1`.*
FROM `test` AS `t1` JOIN `test` AS `t2`
ON `t1`.`group_id`=`t2`.`group_id`
WHERE `t2`.`text` LIKE '%at%'
I don't know if there is a better (more efficient) way to do this query.
You can join to a derived table that returns all selected group_id values:
SELECT t1.*
FROM mytable AS t1
JOIN (SELECT DISTINCT group_id
FROM mytable
WHERE word LIKE '%at%'
) AS t2 ON t1.group_id = t2.group_id
You have to use DISTINCT in the subquery, so as to be sure that you get one row per group_id, so that the final result doesn't contain any duplicates.
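The duplicate problem is easy to reproduce. The sketch below uses Python's sqlite3 module with an in-memory database (SQLite instead of MySQL is an assumption for a self-contained demo) and the sample rows from the question, comparing a plain self join against the join to a DISTINCT derived table.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); rows are from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (group_id INTEGER, word TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "the"), (2, "cat"), (2, "sat"), (3, "on"), (1, "the"), (3, "mat")],
)

# Plain self join: group 2 has two matching words, so each of its rows
# is returned twice.
plain_join = conn.execute(
    """
    SELECT t1.group_id, t1.word
    FROM mytable AS t1
    JOIN mytable AS t2 ON t1.group_id = t2.group_id
    WHERE t2.word LIKE '%at%'
    """
).fetchall()

# Join to a derived table holding one row per matching group_id.
derived_join = conn.execute(
    """
    SELECT t1.group_id, t1.word
    FROM mytable AS t1
    JOIN (SELECT DISTINCT group_id
          FROM mytable
          WHERE word LIKE '%at%'
         ) AS t2 ON t1.group_id = t2.group_id
    ORDER BY t1.group_id, t1.word
    """
).fetchall()

print(len(plain_join), "rows from the plain self join")
print(derived_join)
```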

Trying to delete duplicate rows based on a hash in MySQL

I'm trying to delete duplicate values (which will all have the same nid) based on the hash value.
I'm going to leave the initial (oldest) nid row with the same hash.
For some reason, I get the error "You can't specify target table 'node_revision' for update in FROM clause".
I'm trying to alias my tables, but that doesn't seem to work - what am I doing wrong?
delete from node_revision
WHERE nid NOT IN(SELECT MIN(nid) FROM node_revision GROUP BY hash)
(timestamp is just for illustration, don't actually want this used in any queries)
| nid | hash | timestamp |
| 2 | 123456 | 123364600 |
| 2 | 123456 | 123364601 |
| 2 | 1234567 | 123364602 |
Rows 1 and 3 would survive in this case.
You can phrase this as a left join:
delete nr from node_revision nr left join
(SELECT MIN(nid) as minnid
FROM node_revision
GROUP BY hash
) nrkeep
on nr.nid = nrkeep.minnid
where nrkeep.minnid is null;
You can also "trick" MySQL into using the subquery:
DELETE FROM node_revision
WHERE nid NOT IN (SELECT minnid
FROM (SELECT MIN(nid) as minnid FROM node_revision GROUP BY hash
) t
);
MySQL has a well-documented limitation on using the modified table in update and delete statements. This query gets around the limitation by actually materializing the list of minnids by using a subquery.
EDIT:
Based on the example now in the question, you should use timestamp as follows:
delete nr from node_revision nr left join
(SELECT hash, nid, min(timestamp) as mintimestamp
FROM node_revision
GROUP BY hash, nid
) nrkeep
on nr.hash = nrkeep.hash and
nr.nid = nrkeep.nid and
nr.timestamp = nrkeep.mintimestamp
where nrkeep.mintimestamp is null;

Using ORDER BY and GROUP BY together

My table looks like this (and I'm using MySQL):
m_id | v_id | timestamp
------------------------
6 | 1 | 1333635317
34 | 1 | 1333635323
34 | 1 | 1333635336
6 | 1 | 1333635343
6 | 1 | 1333635349
My target is to take each m_id one time, and order by the highest timestamp.
The result should be:
m_id | v_id | timestamp
------------------------
6 | 1 | 1333635349
34 | 1 | 1333635336
I wrote this query:
SELECT * FROM table GROUP BY m_id ORDER BY timestamp DESC
But, the results are:
m_id | v_id | timestamp
------------------------
34 | 1 | 1333635323
6 | 1 | 1333635317
I think this happens because it first applies GROUP BY and only then orders the results.
Any ideas? Thank you.
One way to do this that correctly uses group by:
select l.*
from table l
inner join (
select
m_id, max(timestamp) as latest
from table
group by m_id
) r
on l.timestamp = r.latest and l.m_id = r.m_id
order by timestamp desc
How this works:
The subquery selects the latest timestamp for each distinct m_id.
The join keeps only the rows from table that match a row from the subquery (this operation, where a join is performed but no columns are selected from the second table because it is used purely as a filter, is known as a "semijoin", in case you were curious).
The ORDER BY then orders the rows.
If you really don't care about which timestamp you'll get and your v_id is always the same for a given m_id, you can do the following:
select m_id, v_id, max(timestamp) from table
group by m_id, v_id
order by max(timestamp) desc
Now, if the v_id changes for a given m_id, then you should do the following:
select t1.* from table t1
left join table t2 on t1.m_id = t2.m_id and t1.timestamp < t2.timestamp
where t2.timestamp is null
order by t1.timestamp desc
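The left join above works as an anti-join: a row survives only when no other row with the same m_id has a later timestamp. Here is a small sketch using Python's sqlite3 module and an in-memory database (SQLite instead of MySQL is an assumption). Because table is a reserved word, the hypothetical name visits is used, and the rows are invented so that v_id varies within an m_id.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
# "visits" is a hypothetical table name; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (m_id INTEGER, v_id INTEGER, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(6, 1, 11), (34, 2, 12), (34, 3, 13), (6, 4, 14), (6, 5, 15)],
)

latest = conn.execute(
    """
    SELECT t1.m_id, t1.v_id, t1.timestamp
    FROM visits t1
    LEFT JOIN visits t2
           ON t1.m_id = t2.m_id
          AND t1.timestamp < t2.timestamp  -- t2 is a strictly later row
    WHERE t2.timestamp IS NULL             -- keep rows with no later row
    ORDER BY t1.timestamp DESC
    """
).fetchall()

for row in latest:
    print(row)
```

Each returned row carries the v_id that belongs to the latest timestamp, which a plain GROUP BY m_id cannot guarantee.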
Here is the simplest solution
select m_id,v_id,max(timestamp) from table group by m_id;
Group by m_id but get max of timestamp for each m_id.
You can try this
SELECT tbl.* FROM (SELECT * FROM table ORDER BY timestamp DESC) as tbl
GROUP BY tbl.m_id
SQL>
SELECT interview.qtrcode QTR, interview.companyname "Company Name", interview.division Division
FROM interview
JOIN jobsdev.employer
ON (interview.companyname = employer.companyname AND employer.zipcode like '100%')
GROUP BY interview.qtrcode, interview.companyname, interview.division
ORDER BY interview.qtrcode;
I was confused when I first tried to understand the question and answers. I spent some time reading and would like to summarize.
The OP's example is a little misleading.
At first I didn't understand why the accepted answer was accepted. I thought that the OP's request could simply be fulfilled with
select m_id, v_id, max(timestamp) as max_time from table
group by m_id, v_id
order by max_time desc
Then I took a second look at the accepted answer and found that what the OP actually means is that, for a sample table like:
m_id | v_id | timestamp
------------------------
6 | 1 | 11
34 | 2 | 12
34 | 3 | 13
6 | 4 | 14
6 | 5 | 15
he wants to select all columns, grouping only by m_id and ordering by timestamp.
Then the above SQL won't work. If you still don't get it, imagine you have more columns than m_id | v_id | timestamp, e.g. m_id | v_id | timestamp | columnA | columnB | columnC | .... With GROUP BY, you can only select the grouped columns and aggregate functions in the result.
By now, you should understand the accepted answer.
What's more, check out the ROW_NUMBER() function introduced in MySQL 8.0:
https://www.mysqltutorial.org/mysql-window-functions/mysql-row_number-function/
(see the section "Finding top N rows of every group").
It does a similar thing to the accepted answer.
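A rough sketch of the ROW_NUMBER() approach, run through Python's sqlite3 module against an in-memory database. This assumes the bundled SQLite is version 3.25 or newer (window functions are not available before that), and it uses SQLite instead of MySQL 8.0 only to keep the example self-contained. The table name visits is hypothetical, since table is a reserved word; the rows are the sample rows from this answer.

```python
import sqlite3

# In-memory SQLite (3.25+ assumed for window functions) stands in for MySQL 8.0.
# "visits" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (m_id INTEGER, v_id INTEGER, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(6, 1, 11), (34, 2, 12), (34, 3, 13), (6, 4, 14), (6, 5, 15)],
)

newest = conn.execute(
    """
    SELECT m_id, v_id, timestamp
    FROM (
        SELECT m_id, v_id, timestamp,
               ROW_NUMBER() OVER (
                   PARTITION BY m_id        -- restart numbering per m_id
                   ORDER BY timestamp DESC  -- newest row gets rn = 1
               ) AS rn
        FROM visits
    ) AS ranked
    WHERE rn = 1
    ORDER BY timestamp DESC
    """
).fetchall()

for row in newest:
    print(row)
```

Changing the filter to rn <= N would return the top N rows per m_id instead of only the newest one.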
Some answers are wrong. My MySQL gives me an error.
select m_id,v_id,max(timestamp) from table group by m_id;
#abinash sahoo
SELECT m_id,v_id,MAX(TIMESTAMP) AS TIME
FROM table_name
GROUP BY m_id
#Vikas Garhwal
Error message:
[42000][1055] Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'testdb.test_table.v_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Why make it so complicated? This worked.
SELECT m_id,v_id,MAX(TIMESTAMP) AS TIME
FROM table_name
GROUP BY m_id
You just need to replace desc with asc. Write the query as below; it will return the values in ascending order.
SELECT * FROM table GROUP BY m_id ORDER BY m_id asc;