Get rows which are related to the searched row, by specific column - mysql

I am trying to write a SQL query for the scenario below.
user_id | nic_number | reg_number | full_name | code
-------------------------------------------------------
B123 | 12345 | 1212 | John | 123
B124 | 12346 | 1213 | Peter | 124
B125 | 12347 | 1214 | Darln | 125
B123 | 12345 | 1212 | John | 126
B123 | 12345 | 1212 | John | 127
In the subscribers table there can be rows with the same user_id, nic_number, reg_number and full_name, but a different code.
First: get the user whose code matches the one I typed in the query (I have implemented a query for that and it works fine).
Second: starting from that data, find the related rows (matched by nic_number and reg_number) and display only those related rows. The query below fetches the data for code = 123, which returns the first row of the table.
But I need to display only the other rows that have the same nic_number or reg_number as the searched code, each only once.
That means the last 2 rows of the table.
select code,
GROUP_CONCAT(distinct trim(nic_number)) as nic_number,
GROUP_CONCAT(distinct trim(reg_number)) as reg_number,
GROUP_CONCAT(distinct trim(full_name)) as full_name from subscribers
where code like lower(concat('123')) group by code;
I need a SQL query for this scenario, obtained by changing the query above (only one query, without joins or triggers).
I have been trying this for a long time and have been unable to get the result. Any help would be much appreciated.

You can combine the nic and reg numbers into a single key to find your records.
EDITED
to extract only the related rows and not the one searched by code.
By the way, code does not seem to be unique in the subscribers table.
select
trim(code) as code,
trim(nic_number) as nic_number,
trim(reg_number) as reg_number,
trim(full_name) as full_name
from
subscribers s1
where
code <> lower(trim('123'))
-- In MySQL, + is numeric addition, so the combined key is built with CONCAT
and concat(trim(nic_number), '|', trim(reg_number)) IN (
select concat(trim(nic_number), '|', trim(reg_number))
from subscribers
where code = lower(trim('123'))
)

I'm not sure why you have specified "without joins" - I get that you may not want to have triggers on a table (which you don't need to achieve this anyway), but a JOIN is standard SQL syntax that will help you achieve the result you are after.
Try:
SELECT
s1.code, s1.nic_number, s1.reg_number, s1.full_name
FROM subscribers s1
INNER JOIN
(
SELECT nic_number, reg_number
FROM subscribers
WHERE code = '123'
) s2
ON s1.nic_number = s2.nic_number
AND s1.reg_number = s2.reg_number
WHERE s1.code <> '123';
Or, if you really need to achieve it with no JOINs at all, then you're just doubling-up the sub-query that you need to include:
SELECT
s1.code, s1.nic_number, s1.reg_number, s1.full_name
FROM subscribers s1
WHERE s1.nic_number IN
(
SELECT nic_number FROM subscribers
WHERE code = '123'
)
AND s1.reg_number IN
(
SELECT reg_number FROM subscribers
WHERE code = '123'
)
AND s1.code <> '123';
The latter query is not necessarily ideal, but it still achieves the desired result.
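To check the JOIN version against the sample rows from the question, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the TEXT column types and the helper name related_codes are assumptions, since the question does not show the table definition.

```python
import sqlite3

# Sample rows copied from the question:
# (user_id, nic_number, reg_number, full_name, code)
ROWS = [
    ("B123", "12345", "1212", "John", "123"),
    ("B124", "12346", "1213", "Peter", "124"),
    ("B125", "12347", "1214", "Darln", "125"),
    ("B123", "12345", "1212", "John", "126"),
    ("B123", "12345", "1212", "John", "127"),
]


def related_codes(search_code):
    """Return the codes of rows that share nic_number and reg_number
    with the searched code, excluding the searched row itself."""
    conn = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not give the schema.
    conn.execute(
        "CREATE TABLE subscribers (user_id TEXT, nic_number TEXT, "
        "reg_number TEXT, full_name TEXT, code TEXT)"
    )
    conn.executemany("INSERT INTO subscribers VALUES (?, ?, ?, ?, ?)", ROWS)
    cur = conn.execute(
        """
        SELECT s1.code
        FROM subscribers s1
        INNER JOIN (
            SELECT nic_number, reg_number
            FROM subscribers
            WHERE code = ?
        ) s2
          ON s1.nic_number = s2.nic_number
         AND s1.reg_number = s2.reg_number
        WHERE s1.code <> ?
        ORDER BY s1.code
        """,
        (search_code, search_code),
    )
    codes = [row[0] for row in cur]
    conn.close()
    return codes
```

Calling related_codes("123") on this data yields the codes of the last two rows of the table, which is what the question asks for.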

Related

Select max date by grouping?

PLEASE will someone help? I've put HOURS into this silly, stupid problem. This Stack Overflow post is EXACTLY my question, and I have tried BOTH suggested solutions to no avail.
Here are MY specifics. I have extracted 4 records from my actual database, and excluded no fields:
master_id | date_sent | type | mailing | response
00001 | 2015-02-28 00:00:00 | PHONE | NULL | NULL
00001 | 2015-03-13 14:45:20 | EMAIL | ThankYou.html | NULL
00001 | 2015-03-13 14:34:43 | EMAIL | ThankYou.html | NULL
00001 | 2015-01-11 00:00:00 | EMAIL | KS_PREVIEW | TRUE
00001 | 2015-03-23 21:42:03 | EMAIL | MailChimp Update #2 | NULL
(sorry about the alignment of the columns.)
I want to get the most recent mailing and date_sent for each master_id. (My extract is of only one master_id to make this post simple.)
So I run this query:
SELECT master_id,date_sent,mailing
FROM contact_copy
WHERE type="EMAIL"
and get the expected result:
master_id date_sent mailing
1 3/13/2015 14:45:20 ThankYou.html
1 3/13/2015 14:34:43 ThankYou.html
1 1/11/2015 0:00:00 KS_PREVIEW
1 3/23/2015 21:42:03 MailChimp Update #2
BUT, when I add this simple aggregation to get the most recent date:
SELECT master_id,max(date_sent),mailing
FROM contact_copy
WHERE type="EMAIL"
group BY master_id
;
I get an UNEXPECTED result:
master_id max(date_sent) mailing
00001 2015-03-23 21:42:03 ThankYou.html
So my question: why is it returning the WRONG MAILING?
It's making me nuts! Thanks.
By the way, I'm not a developer, so sorry if I'm breaking some etiquette rule of asking. :)
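That's because when you use GROUP BY, every column in the SELECT list has to be either listed in the GROUP BY clause or wrapped in an aggregate function. mailing is neither, so MySQL returns its value from an arbitrary row of the group.
You should use a subquery or a join to make it work: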
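SELECT cc.master_id, cc.date_sent, cc.mailing
FROM contact_copy cc
JOIN
( SELECT master_id, MAX(date_sent) AS date_sent
FROM contact_copy
WHERE type="EMAIL"
GROUP BY master_id
) result
ON cc.master_id = result.master_id AND cc.date_sent = result.date_sent
-- repeat the type filter so a non-EMAIL row with the same date is not matched
WHERE cc.type="EMAIL"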
You're getting an "unexpected" result because of a MySQL specific extension to the GROUP BY functionality. The result you're getting is actually expected, according to the MySQL Reference Manual.
Ref: https://dev.mysql.com/doc/refman/5.5/en/group-by-handling.html
Other database engines would reject your query as invalid, with an error along the lines of "non-aggregate expressions included in the SELECT list not included in the GROUP BY".
We can get MySQL to behave like other databases (and return an error for that query) if we include ONLY_FULL_GROUP_BY in the SQL mode.
Ref: https://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_only_full_group_by
To get the result you are looking for...
If the (master_id,type,date_sent) tuple is UNIQUE in contact_copy (that is, if for given values of master_id and type, there will be no "duplicate" values of date_sent), we could use a JOIN operation to retrieve the specified result.
First, we write a query to get the "maximum" date_sent for a given master_id and type. For example:
SELECT mc.master_id
     , mc.type
     , MAX(mc.date_sent) AS max_date_sent
  FROM contact_copy mc
 WHERE mc.master_id = '00001'
   AND mc.type = 'EMAIL'
 GROUP BY mc.master_id, mc.type
To retrieve the entire row associated with that "maximum" date_sent, we can use that query as an inline view. That is, wrap the query text in parens, assign an alias, and then reference that as if it were a table, for example:
SELECT c.master_id
     , c.date_sent
     , c.mailing
  FROM ( SELECT mc.master_id
              , mc.type
              , MAX(mc.date_sent) AS max_date_sent
           FROM contact_copy mc
          WHERE mc.master_id = '00001'
            AND mc.type = 'EMAIL'
          GROUP BY mc.master_id, mc.type
       ) m
  JOIN contact_copy c
    ON c.master_id = m.master_id
   AND c.type = m.type
   AND c.date_sent = m.max_date_sent
Note that if there are multiple rows that have the same values of master_id,type and date_sent, there is potential to return more than one row. You could add a LIMIT 1 clause to guarantee that you return only one row; which of those rows is returned is indeterminate, without an ORDER BY clause before the LIMIT clause.
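The inline-view approach can be tried against the five rows from the question. The following is a rough sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. Storing date_sent as ISO-formatted text, dropping the master_id filter so the query returns one row per master_id, and the helper name latest_email are all assumptions made for illustration.

```python
import sqlite3

# Rows copied from the question:
# (master_id, date_sent, type, mailing, response)
ROWS = [
    ("00001", "2015-02-28 00:00:00", "PHONE", None, None),
    ("00001", "2015-03-13 14:45:20", "EMAIL", "ThankYou.html", None),
    ("00001", "2015-03-13 14:34:43", "EMAIL", "ThankYou.html", None),
    ("00001", "2015-01-11 00:00:00", "EMAIL", "KS_PREVIEW", "TRUE"),
    ("00001", "2015-03-23 21:42:03", "EMAIL", "MailChimp Update #2", None),
]


def latest_email(rows):
    """Return (master_id, date_sent, mailing) of the newest EMAIL row
    for each master_id."""
    conn = sqlite3.connect(":memory:")
    # Assumption: dates are kept as ISO text, which sorts chronologically.
    conn.execute(
        "CREATE TABLE contact_copy (master_id TEXT, date_sent TEXT, "
        "type TEXT, mailing TEXT, response TEXT)"
    )
    conn.executemany("INSERT INTO contact_copy VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT c.master_id, c.date_sent, c.mailing
        FROM (
            SELECT master_id, type, MAX(date_sent) AS max_date_sent
            FROM contact_copy
            WHERE type = 'EMAIL'
            GROUP BY master_id, type
        ) m
        JOIN contact_copy c
          ON c.master_id = m.master_id
         AND c.type = m.type
         AND c.date_sent = m.max_date_sent
        ORDER BY c.master_id
        """
    ).fetchall()
    conn.close()
    return result
```

On the sample data this pairs the latest date with the mailing from that same row (MailChimp Update #2) rather than with ThankYou.html.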

Subquery with max value in a big table SQL

I'm trying to write a query to get the start date of a person's most recent work experience and also the date they left the company (in some cases that value is null because the person is still working at the company).
I have something like:
SELECT r.idcurriculum, r.startdate, r.lastdate FROM (
SELECT idcurriculum, max(startdate) as startdate
FROM workexperience
GROUP BY idcurriculum) as s
INNER JOIN workexperience r on (r.idcurriculum = s.idcurriculum)
The structure should come out something like this:
idcurriculum | startdate | lastdate
1234 | 2010-05-01| null
2532 | 2005-10-01| 2010-02-28
5234 | 2011-07-01| 2013-10-31
1025 | 2012-04-01| 2014-03-31
I tried running that query but had to stop it because it was taking too long. The workexperience table weighs approx. 20 GB. I don't know if the query is wrong; I only ran it for 10 minutes.
Help will be much appreciated.
You might try rephrasing the query as:
select we.*
from workexperience we
where not exists (select 1
                  from workexperience we2
                  where we2.idcurriculum = we.idcurriculum and
                        we2.startdate > we.startdate
                 );
Important: for performance reasons you need a composite index on idcurriculum, startdate:
create index idx_workexperience_idcurriculum_startdate on workexperience(idcurriculum, startdate)
The logic of the query is: "Get me all rows from workexperience where there is no row for the same idcurriculum that has a larger startdate". That is a fancy way of saying "get me the maximum".
With the group by, MySQL has to do an aggregation, which would typically involve sorting the data -- expensive on 20 Gbytes. With this method, it can look up the results using the index, which should be faster.
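As a small illustration of the NOT EXISTS logic, the sketch below runs it on a handful of rows in Python with the standard-library sqlite3 module standing in for MySQL. The older rows for each idcurriculum are hypothetical, the helper name latest_experience is made up for this example, and nothing here says anything about performance on a 20 GB table.

```python
import sqlite3

# (idcurriculum, startdate, lastdate). The latest row per person matches
# the expected output in the question; the older rows are hypothetical.
ROWS = [
    (1234, "2008-01-01", "2010-04-30"),
    (1234, "2010-05-01", None),
    (2532, "2001-03-01", "2005-09-30"),
    (2532, "2005-10-01", "2010-02-28"),
]


def latest_experience(rows):
    """Return the row with the greatest startdate for each idcurriculum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workexperience "
        "(idcurriculum INTEGER, startdate TEXT, lastdate TEXT)"
    )
    conn.executemany("INSERT INTO workexperience VALUES (?, ?, ?)", rows)
    # Composite index so the correlated lookup can be served by the index.
    conn.execute(
        "CREATE INDEX idx_workexperience_idcurriculum_startdate "
        "ON workexperience (idcurriculum, startdate)"
    )
    result = conn.execute(
        """
        SELECT we.idcurriculum, we.startdate, we.lastdate
        FROM workexperience we
        WHERE NOT EXISTS (
            SELECT 1
            FROM workexperience we2
            WHERE we2.idcurriculum = we.idcurriculum
              AND we2.startdate > we.startdate
        )
        ORDER BY we.idcurriculum
        """
    ).fetchall()
    conn.close()
    return result
```

Only the rows for which no later startdate exists survive the filter, so each idcurriculum keeps its most recent experience.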
As an alternative to Gordon's answer you could also write the query as:
SELECT we.*
FROM workexperience we
LEFT JOIN workexperience we2
ON we2.idcurriculum = we.idcurriculum
AND we2.startdate > we.startdate
WHERE we2.idcurriculum IS NULL;
However, you can run into problems when several rows in a group share the maximum startdate, because all of them are returned.
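The tie behaviour is easy to see on a tiny data set. The sketch below, in Python with the standard-library sqlite3 module standing in for MySQL, runs the LEFT JOIN form on hypothetical rows in which two entries share the latest startdate. The rows and the helper name latest_by_left_join are invented for illustration.

```python
import sqlite3

# Hypothetical rows: two entries for the same person share the latest startdate.
TIED_ROWS = [
    (1025, "2010-01-01", "2012-03-31"),
    (1025, "2012-04-01", "2014-03-31"),
    (1025, "2012-04-01", None),
]


def latest_by_left_join(rows):
    """Return rows that have no later startdate for the same idcurriculum."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workexperience "
        "(idcurriculum INTEGER, startdate TEXT, lastdate TEXT)"
    )
    conn.executemany("INSERT INTO workexperience VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT we.idcurriculum, we.startdate, we.lastdate
        FROM workexperience we
        LEFT JOIN workexperience we2
          ON we2.idcurriculum = we.idcurriculum
         AND we2.startdate > we.startdate
        WHERE we2.idcurriculum IS NULL
        """
    ).fetchall()
    conn.close()
    return result
```

Both rows dated 2012-04-01 come back, so a caller that expects exactly one row per idcurriculum would need an extra tie-breaker, for example a unique id column if the table has one.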

Getting the MAX Record for the Most Recent Serial Numbers

I have the following table with some sample data.
Record_ID Counter Serial Owner
1 0 AAA Jack
2 1 AAA Kevin
3 0 BBB Jane
4 1 BBB Wendy
Based on data similar to the above, I am trying to write a SQL query for MySQL that gets the record with the maximum Counter value per Serial number. The part I seem to be having trouble with is getting the query to get the last 50 unique serial numbers that were updated.
Below is the query I came up with so far based on this StackOverflow question.
SELECT *
FROM `history` his
INNER JOIN(SELECT serial,
Max(counter) AS MaxCount
FROM `tracking`
WHERE serial IN (SELECT serial
FROM `history`)
GROUP BY serial
ORDER BY record_id DESC) q
ON his.serial = q.serial
AND his.counter = q.maxcount
LIMIT 0, 50
It looks like a classic greatest-n-per-group problem, which can be solved by something like this:
select his.Record_ID, his.Counter, his.Serial, his.Owner
from History his
inner join(
select Serial, max(Counter) Counter
from History
group by Serial
) ss on his.Serial = ss.Serial and his.Counter = ss.Counter
If you are to have specific filters on your data set, you should apply the said filters in the sub-query.
Another source with more explanation on the problem here: SQL Select only rows with Max Value on a Column
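Here is a minimal sketch of the greatest-n-per-group join on the sample rows, in Python with the standard-library sqlite3 module standing in for MySQL. For the "last 50 serial numbers" part it assumes that a higher Record_ID means a more recent update, which the question does not state, and the helper name latest_per_serial is invented for the example.

```python
import sqlite3

# Sample rows copied from the question: (record_id, counter, serial, owner)
ROWS = [
    (1, 0, "AAA", "Jack"),
    (2, 1, "AAA", "Kevin"),
    (3, 0, "BBB", "Jane"),
    (4, 1, "BBB", "Wendy"),
]


def latest_per_serial(rows, limit=50):
    """Return the row with the highest counter for each serial,
    newest first, capped at `limit` rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE history "
        "(record_id INTEGER, counter INTEGER, serial TEXT, owner TEXT)"
    )
    conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT his.record_id, his.counter, his.serial, his.owner
        FROM history his
        INNER JOIN (
            SELECT serial, MAX(counter) AS counter
            FROM history
            GROUP BY serial
        ) ss
          ON his.serial = ss.serial
         AND his.counter = ss.counter
        -- Assumption: a higher record_id means a more recent update.
        ORDER BY his.record_id DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return result
```

The ORDER BY and LIMIT sit on the outer query, because an ORDER BY inside the grouped sub-query does not control which rows the outer LIMIT keeps.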

Complicated joining on multiple id's

I have a table like this
id | user_id | code | type | time
-----------------------------------
2 2 fdsa r 1358300000
3 2 barf r 1358311000
4 2 yack r 1358311220
5 3 surf r 1358311000
6 3 yooo r 1358300000
7 4 poot r 1358311220
I want to get the concatenated 'code' column for user 2 and user 3 for each matching time.
I want to receive a result set like this:
code | time
-------------------------------
fdsayooo 1358300000
barfsurf 1358311000
Please note that there is no yackpoot code because the query was not looking for user 4.
You can use GROUP_CONCAT function. Try this:
SELECT GROUP_CONCAT(code SEPARATOR '') code, time
FROM tbl
WHERE user_id in (2, 3)
GROUP BY time
HAVING COUNT(time) = 2;
What you are looking for is GROUP_CONCAT, but you are missing a lot of details in your question to provide a good example. This should get you started:
SELECT GROUP_CONCAT(code), time
FROM myTable
WHERE user_id in (2, 3)
GROUP BY time;
Missing details are:
Is there an order required? GROUP_CONCAT accepts its own ORDER BY inside the parentheses, but test it if the order is critical.
Do you need other fields? If so, you will likely end up needing a sub-select or a secondary query.
Do you only want times that have more than one matching row?
Do you really want no separator between the values in the result column? (Specify the delimiter with SEPARATOR '' in the GROUP_CONCAT.)
Notes:
You can add more fields to the GROUP BY if you want to do it by something else (like user_id and time).
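To see the grouping and the HAVING filter at work on the sample table, here is a small sketch in Python using the standard-library sqlite3 module. SQLite spells the call group_concat(code, '') where MySQL uses GROUP_CONCAT(code SEPARATOR ''), and this sketch does not rely on the order in which the codes are joined. Counting distinct user_id values in the HAVING clause, and the helper name codes_by_time, are choices made for this example.

```python
import sqlite3

# Sample rows copied from the question: (id, user_id, code, type, time)
ROWS = [
    (2, 2, "fdsa", "r", 1358300000),
    (3, 2, "barf", "r", 1358311000),
    (4, 2, "yack", "r", 1358311220),
    (5, 3, "surf", "r", 1358311000),
    (6, 3, "yooo", "r", 1358300000),
    (7, 4, "poot", "r", 1358311220),
]


def codes_by_time(rows, user_ids=(2, 3)):
    """Concatenate codes per time, keeping only times at which
    every requested user has a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl "
        "(id INTEGER, user_id INTEGER, code TEXT, type TEXT, time INTEGER)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in user_ids)
    result = conn.execute(
        f"""
        SELECT group_concat(code, '') AS code, time
        FROM tbl
        WHERE user_id IN ({placeholders})
        GROUP BY time
        HAVING COUNT(DISTINCT user_id) = ?
        ORDER BY time
        """,
        (*user_ids, len(user_ids)),
    ).fetchall()
    conn.close()
    return result
```

The time 1358311220 drops out because only user 2 has a row there once user 4 is filtered away, which matches the note in the question about yackpoot.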

Excluding 'near' duplicates from a mysql query

We have an iPhone app that sends invoice data by each of our employees several times per day. When they are in low cell signal areas tickets can come in as duplicates, however they are assigned a unique 'job id' in the mysql database, so they're viewed as unique. I could exclude the job id and make the rest of the columns DISTINCT, which gives me the filtered rows I'm looking for (since literally every data point is identical except for the job id), however I need the job ID since it's the primary reference point for each invoice and is what I point to for: approvals, edits, etc.
So my question is, how can I filter out 'near' duplicate rows in my query, while still pulling in the job id for each ticket?
The current query is below:
SELECT * FROM jobs, users
WHERE jobs.job_csuper = users.user_id
AND users.user_email = '".$login."'
AND jobs.job_approverid1 = '0'
Thanks for looking into it!
Edit (examples provided):
This is what I meant by 'near duplicate'
Job_ID - Job_title - Job_user - Job_time - Job_date
2345 - Worked on circuits - John Smith - 1.50 - 2013-01-01
2344 - Worked on circuits - John Smith - 1.50 - 2013-01-01
2343 - Worked on circuits - John Smith - 1.50 - 2013-01-01
So everything is identical except for the Job_ID column.
You want a group by:
SELECT *
FROM jobs, users
WHERE jobs.job_csuper = users.user_id
AND users.user_email = '".$login."'
AND jobs.job_approverid1 = '0'
group by <all fields from jobs except jobid>
I think the final query should look something like this:
select min(Job_ID) as JobId, Job_title, users.name as Job_user, Job_time, Job_date
FROM jobs join users
on jobs.job_csuper = users.user_id
WHERE users.user_email = '".$login."' AND jobs.job_approverid1 = '0'
group by Job_title, users.name, Job_time, Job_date
(This uses ANSI syntax for joins and is explicit about the fields coming back.)
It's better to prevent the double submission.
Given that you cannot prevent the double submission...
I would query like this:
select
min(Job_ID) as real_job_id
,count(Job_ID) as num_dup_job_ids
,group_concat(Job_ID) as all_dup_job_ids
,j.Job_title, j.Job_user, j.Job_time, j.Job_date
from
jobs j
inner join users u on u.user_id = j.job_csuper
where
whatever_else
group by
j.Job_title, j.Job_user, j.Job_time, j.Job_date
That includes more than you explicitly asked for. But it's probably good to be reminded of how many dups you have, and it gives you easy access to the duplicate id info when you need it.
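The grouping idea can be tried on the three near-duplicate rows from the question. The sketch below uses Python with the standard-library sqlite3 module as a stand-in for MySQL. It leaves out the join to users for brevity, the fourth row is hypothetical, and the column types and the helper name dedupe_jobs are assumptions.

```python
import sqlite3

# (job_id, job_title, job_user, job_time, job_date)
ROWS = [
    (2345, "Worked on circuits", "John Smith", 1.5, "2013-01-01"),
    (2344, "Worked on circuits", "John Smith", 1.5, "2013-01-01"),
    (2343, "Worked on circuits", "John Smith", 1.5, "2013-01-01"),
    # Hypothetical extra row, added to show that distinct jobs stay separate.
    (2346, "Replaced panel", "John Smith", 2.0, "2013-01-02"),
]


def dedupe_jobs(rows):
    """Collapse near-duplicate jobs and return tuples of
    (real_job_id, num_dup_job_ids, all_dup_job_ids, job_title)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (job_id INTEGER, job_title TEXT, "
        "job_user TEXT, job_time REAL, job_date TEXT)"
    )
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?, ?)", rows)
    cur = conn.execute(
        """
        SELECT MIN(job_id)          AS real_job_id,
               COUNT(job_id)        AS num_dup_job_ids,
               group_concat(job_id) AS all_dup_job_ids,
               job_title
        FROM jobs
        GROUP BY job_title, job_user, job_time, job_date
        ORDER BY real_job_id
        """
    )
    result = []
    for real_id, count, all_ids, title in cur:
        # group_concat returns a comma-separated string; sort the ids here
        # because the concatenation order is not guaranteed.
        ids = sorted(int(part) for part in all_ids.split(","))
        result.append((real_id, count, ids, title))
    conn.close()
    return result
```

Each group keeps its lowest job id as the reference, while the full id list stays available for approvals or edits that were made against one of the duplicates.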
How about creating a hash for each row and comparing them:
`SHA1(CONCAT_WS('|', field1, field2, field3, ...)) AS jobhash`