Given the following table,
how should I write a query to produce the output below?
My initial query is:
select
bpai.sequence_id, bpai.last_name, bpai.given_name,
bpai.middle_name, bpai.middle_initial,
bpai.gender, bpai.birth_date, bpai.birth_place_via_psgc, bpai.citizenship,
bpai.primary_mobile_number, bpai.primary_email_address, count(*)
from bpaitbl bpai
inner join (select
*, count(*) as countof
from bpaitbl
group by
last_name, middle_name,
gender, birth_date, citizenship
having (count(*) > 1)
) profil on bpai.last_name like profil.middle_name
and bpai.gender = profil.gender
and bpai.birth_date = profil.birth_date
and bpai.citizenship = profil.citizenship
;
I can't get the desired output no matter what I try. Please help.
Since you don't say in your question, but mention in your title, what the result set represents, I presume you want a list of rows with duplicate records.
The following query lists one row for each set of duplicate records, showing the highest sequence_id among the duplicates:
SELECT
MAX(a.sequence_id) AS sequence_id,
a.last_name,
a.given_name,
a.middle_name,
a.gender,
a.birth_date,
a.citizenship
FROM bpaitbl a
GROUP BY
a.last_name,
a.given_name,
a.middle_name,
a.gender,
a.birth_date,
a.citizenship
HAVING COUNT(*) > 1;
To show the lowest sequence_id instead, change MAX to MIN. If you want every group, including those without duplicates, remove the HAVING clause.
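If it helps to check the idea on toy data first, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration and the table is reduced to the columns used for matching, so treat it as an assumption about your schema rather than your actual data.

```python
import sqlite3

# Toy stand-in for bpaitbl (hypothetical rows, reduced column set).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bpaitbl (sequence_id INTEGER, last_name TEXT, given_name TEXT, "
    "middle_name TEXT, gender TEXT, birth_date TEXT, citizenship TEXT)"
)
conn.executemany(
    "INSERT INTO bpaitbl VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Cruz", "Ana", "Reyes", "F", "1990-01-01", "PH"),
        (2, "Cruz", "Ana", "Reyes", "F", "1990-01-01", "PH"),  # duplicate of row 1
        (3, "Santos", "Ben", "Lim", "M", "1985-05-05", "PH"),  # no duplicate
    ],
)

# One row per duplicated person, keeping the highest sequence_id of each group.
duplicates = conn.execute(
    """
    SELECT MAX(sequence_id), last_name, given_name, COUNT(*)
    FROM bpaitbl
    GROUP BY last_name, given_name, middle_name, gender, birth_date, citizenship
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)  # [(2, 'Cruz', 'Ana', 2)]
```

Only the duplicated person comes back; the row without a duplicate is filtered out by the HAVING clause.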
I am running a SELECT query to return addresses in a table associated with a certain "applicant code" and I'd like to join a table to also return (in the same row) the name of that applicant.
Therefore my query as of now is
SELECT a.id, a.created_at, a.updated_at, a.code, a.applicant_code, a.form_code, a.address_line_1, a.address_line_2, a.town_city, a.county_state, a.country, a.post_code, a.start_date, a.end_date, a.type, ap.first_name, ap.last_name
FROM sfs_addresses a
JOIN sfs_personal_details ap ON a.form_code = ap.form_code
WHERE a.form_code = ? AND a.applicant_code = ?
The query works, and I get the right columns and values in each row, but it returns every row twice, like so:
ID
===
1
1
2
2
3
3
4
4
If I remove the JOIN it works fine. I have tried adding DISTINCT, but it makes no difference. I'm lost.
EDIT: Based on this answer and the comments, the OP realized that the JOIN condition should be on applicant_code rather than form_code.
You have duplicates in the second table based on the JOIN key you are using (I question if the JOIN is correct).
If you just want one row arbitrarily, you can use row_number():
SELECT a.*, ap.first_name, ap.last_name
FROM sfs_addresses a JOIN
(SELECT ap.*,
ROW_NUMBER() OVER (PARTITION BY ap.form_code ORDER BY ap.form_code) as seqnum
FROM sfs_personal_details ap
) ap
ON a.form_code = ap.form_code AND ap.seqnum = 1
WHERE a.form_code = ? AND a.applicant_code = ?;
You can change the columns in the ORDER BY to control which row is kept, for instance the oldest or the most recent.
Note: form_code seems like an odd JOIN column for a table called "personal details". So, you might just need to fix the JOIN condition.
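To make the duplication visible, here is a small sketch with Python's built-in sqlite3 module. The tables are cut down to a few columns, the rows are made up, and it assumes sfs_personal_details also carries an applicant_code column and that your SQLite build supports window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sfs_addresses (id INTEGER, form_code TEXT, applicant_code TEXT);
    CREATE TABLE sfs_personal_details (form_code TEXT, applicant_code TEXT, first_name TEXT);
    INSERT INTO sfs_addresses VALUES (1, 'F1', 'A1'), (2, 'F1', 'A1');
    -- Two applicants share one form, so form_code alone is not unique here.
    INSERT INTO sfs_personal_details VALUES ('F1', 'A1', 'Alice'), ('F1', 'A2', 'Bob');
    """
)

# Joining on form_code only: every address matches both applicants.
fanned_out = conn.execute(
    "SELECT a.id FROM sfs_addresses a "
    "JOIN sfs_personal_details ap ON a.form_code = ap.form_code "
    "WHERE a.form_code = ? AND a.applicant_code = ? ORDER BY a.id",
    ("F1", "A1"),
).fetchall()
print(fanned_out)  # [(1,), (1,), (2,), (2,)]

# Keeping only seqnum = 1 leaves one detail row per form_code.
one_per_form = conn.execute(
    """
    SELECT a.id, ap.first_name
    FROM sfs_addresses a
    JOIN (SELECT pd.*,
                 ROW_NUMBER() OVER (PARTITION BY pd.form_code
                                    ORDER BY pd.applicant_code) AS seqnum
          FROM sfs_personal_details pd) ap
      ON a.form_code = ap.form_code AND ap.seqnum = 1
    WHERE a.form_code = ? AND a.applicant_code = ?
    ORDER BY a.id
    """,
    ("F1", "A1"),
).fetchall()
print(one_per_form)  # [(1, 'Alice'), (2, 'Alice')]
```

In this toy data the row that survives happens to be the right applicant only because of the ORDER BY; joining on applicant_code as well is the more robust fix.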
The relation between the two tables is one-to-many. To return non-duplicate rows, use DISTINCT:
SELECT distinct a.id, a.created_at, a.updated_at, a.code, a.applicant_code, a.form_code, a.address_line_1, a.address_line_2, a.town_city, a.county_state, a.country, a.post_code, a.start_date, a.end_date, a.type, ap.first_name, ap.last_name
FROM sfs_addresses a
JOIN sfs_personal_details ap ON a.form_code = ap.form_code
WHERE a.form_code = ? AND a.applicant_code = ?
The following query always outputs SUM for all rows instead of per userid. Not sure where else to look. Please help.
SELECT * FROM assignments
LEFT JOIN (
SELECT SUM(timeworked) AS totaltimeworked
FROM time_entries
) assignments ON (userid = assignments.userid AND ticketid = ?)
WHERE ticketid = ?
ORDER BY assigned,scheduled
If you want to keep the SELECT *, you would have to add a group by clause in the subquery. Something like this
SELECT * FROM assignments
LEFT JOIN (
SELECT userid, SUM(timeworked) AS totaltimeworked
FROM time_entries
GROUP BY userid
) time_entriesSummed ON time_entriesSummed.userid = assignments.userid
WHERE ticketid = ?
ORDER BY assigned,scheduled
But a better way would be to replace the SELECT * with the fields you actually want and add a GROUP BY clause directly. Something like this
SELECT
assignments.id,
assignments.assigned,
assignments.scheduled,
SUM(time_entries.timeworked) AS totalTimeworked
FROM assignments
LEFT JOIN time_entries
ON time_entries.userid = assignments.userid
GROUP BY assignments.id, assignments.assigned, assignments.scheduled
Edit 1
Included table names in query 2 as mentioned in chameera's comment below
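For anyone who wants to see the per-user totals in action, here is a rough sketch using Python's built-in sqlite3 module. The column types and sample rows are assumptions made up for the demo; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assignments (id INTEGER, userid INTEGER, ticketid INTEGER);
    CREATE TABLE time_entries (userid INTEGER, timeworked INTEGER);
    INSERT INTO assignments VALUES (1, 10, 7), (2, 20, 7);
    INSERT INTO time_entries VALUES (10, 30), (10, 15), (20, 60);
    """
)

# The derived table groups by userid, so each assignment joins to its own total
# instead of one grand total for the whole time_entries table.
totals = conn.execute(
    """
    SELECT assignments.id, summed.totaltimeworked
    FROM assignments
    LEFT JOIN (SELECT userid, SUM(timeworked) AS totaltimeworked
               FROM time_entries
               GROUP BY userid) summed
      ON summed.userid = assignments.userid
    WHERE assignments.ticketid = ?
    ORDER BY assignments.id
    """,
    (7,),
).fetchall()
print(totals)  # [(1, 45), (2, 60)]
```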
I would like to know how I can write a SQL script so that, within a group of individuals initially selected with:
SELECT [RECORDS].[CONSTITUENT_ID]
,[RECORDS].[FIRST_NAME]
,[RECORDS].[LAST_NAME]
,[DATEADDED]
,[DTE]
,[Amount]
,[REF]
,[TYPE]
FROM [re7].[dbo].[GIFT]
INNER JOIN [re7].[dbo].[RECORDS]
ON GIFT.CONSTIT_ID LIKE RECORDS.ID
WHERE ([DTE] BETWEEN '2/7/2015' AND '2/8/2015')
ORDER BY [DATEADDED] DESC
select only individuals who are "First Time Donors" (i.e. someone who has only one gift in [re7].[dbo].[GIFT]).
[RECORDS] is a table of all the constituents.
[GIFT] is a table of all recorded Gifts.
The output of the above query is just a table with:
CONSTITUENT_ID, FIRST_NAME, LAST_NAME, DATEADDED, DTE, Amount, REF, TYPE
I pretty much want to see the same output format, but I would like the query to select only CONSTITUENT_ID who only have 1 GIFT (by their Record ID) in [re7].[dbo].[GIFT].
I apologize for the lack of data to show. I wish I could describe better....
SELECT [RECORDS].[CONSTITUENT_ID]
,[RECORDS].[FIRST_NAME]
,[RECORDS].[LAST_NAME]
,[DATEADDED]
,[DTE]
,[Amount]
,[REF]
,[TYPE]
FROM [re7].[dbo].[GIFT]
INNER JOIN [re7].[dbo].[RECORDS]
ON GIFT.CONSTIT_ID LIKE RECORDS.ID
WHERE ([DTE] BETWEEN '2/7/2015' AND '2/8/2015')
AND GIFT.CONSTIT_ID IN (
SELECT CONSTIT_ID FROM re7.dbo.Gift GROUP BY CONSTIT_ID HAVING COUNT(*) = 1
) /* another option is to add a subquery to the query you already had */
ORDER BY [DATEADDED] DESC
This solution simply selects all the constituents who have made only one donation and then joins to that, thereby limiting the result set.
SELECT
r.[CONSTITUENT_ID]
,r.[FIRST_NAME]
,r.[LAST_NAME]
,[DATEADDED]
,[DTE]
,[Amount]
,[REF]
,[TYPE]
FROM
(select [CONSTIT_ID] from [re7].[dbo].[GIFT] group by [CONSTIT_ID] having count([CONSTIT_ID]) = 1) g1
inner join [re7].[dbo].[GIFT] g
on g.[CONSTIT_ID] = g1.[CONSTIT_ID]
INNER JOIN [re7].[dbo].[RECORDS] r
ON g.CONSTIT_ID LIKE r.ID
WHERE ([DTE] BETWEEN '2/7/2015' AND '2/8/2015')
ORDER BY [DATEADDED] DESC
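Here is a small sketch of the HAVING COUNT(*) = 1 filter, run against SQLite through Python's built-in sqlite3 module rather than SQL Server. The rows are invented, the tables are trimmed to a few columns, and the dates are stored as ISO strings, which is an assumption made only so the BETWEEN comparison works in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE RECORDS (ID INTEGER, CONSTITUENT_ID TEXT, FIRST_NAME TEXT);
    CREATE TABLE GIFT (CONSTIT_ID INTEGER, DTE TEXT, Amount INTEGER);
    INSERT INTO RECORDS VALUES (1, 'C-001', 'Ann'), (2, 'C-002', 'Bob');
    -- Ann has a single gift; Bob has two, one of them inside the date range.
    INSERT INTO GIFT VALUES (1, '2015-02-07', 50),
                            (2, '2015-01-10', 20),
                            (2, '2015-02-07', 25);
    """
)

first_time = conn.execute(
    """
    SELECT r.CONSTITUENT_ID, g.Amount
    FROM GIFT g
    INNER JOIN RECORDS r ON g.CONSTIT_ID = r.ID
    WHERE g.DTE BETWEEN '2015-02-07' AND '2015-02-08'
      AND g.CONSTIT_ID IN (SELECT CONSTIT_ID
                           FROM GIFT
                           GROUP BY CONSTIT_ID
                           HAVING COUNT(*) = 1)
    ORDER BY r.CONSTITUENT_ID
    """
).fetchall()
print(first_time)  # [('C-001', 50)]
```

Bob's gift falls inside the date range but he is excluded, because the subquery counts all of his gifts, not only those in the range.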
I have the following syntactically incorrect query with aliases in_Degree and out_degree:
insert into userData
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree,
in_degree + out_degree(freq)
from users u
The problem in the query is the 4th item in the select list, aliased as freq. I want the 4th item to have the value in_degree + out_degree. The brute-force, extremely slow solution would be to copy and paste both subqueries and add them.
How can I make this fast and as simple as in_degree + out_degree?
You could use a subquery:
insert into userData
select user_name,
in_degree,
out_degree,
in_degree + out_degree
from
(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) src
Or you might be able to use:
insert into userData
select user_name,
count(distinct in_t.*) in_degree,
count(distinct out_t.*) out_degree,
count(distinct in_t.*) + count(distinct out_t.*)
from users u
left join tweets in_t
on u.USER_NAME = in_t.rt_user_name
left join tweets out_t
on u.USER_NAME = out_t.source_user_name
group by u.user_name
As you have discovered, you can't reference the aliases given in that select list, except in a HAVING clause or an ORDER BY clause.
One option is to use your query as an "inline view", and write a wrapper query around that.
remove the 4th (invalid) expression from the select list in your query,
wrap your query in a set of parens
follow the closing paren with an alias (e.g. s)
write a query around that, referencing the inline view as if it were a table
the select list on the outer query can reference the "aliases" defined in the inline view.
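The steps above can be sketched roughly as follows, using Python's built-in sqlite3 module with a few invented rows. The column types and data are assumptions; the alias s is the inline-view alias from the steps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_name TEXT);
    CREATE TABLE tweets (source_user_name TEXT, rt_user_name TEXT);
    INSERT INTO users VALUES ('ann'), ('bob');
    INSERT INTO tweets VALUES ('ann', 'bob'), ('ann', 'bob'), ('bob', 'ann');
    """
)

# The inner query (aliased s) defines in_degree and out_degree;
# the outer query can then add them by name.
rows = conn.execute(
    """
    SELECT s.user_name, s.in_degree, s.out_degree,
           s.in_degree + s.out_degree AS freq
    FROM (SELECT u.user_name,
                 (SELECT COUNT(*) FROM tweets
                  WHERE rt_user_name = u.user_name) AS in_degree,
                 (SELECT COUNT(*) FROM tweets
                  WHERE source_user_name = u.user_name) AS out_degree
          FROM users u) s
    ORDER BY s.user_name
    """
).fetchall()
print(rows)  # [('ann', 1, 2, 3), ('bob', 2, 1, 3)]
```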
However, if you want to make this "fast", you might consider (as an option) taking an entirely different tack. Rather than using correlated subqueries to get the count for each individual user, you could get the counts for all users at once, and then use the LEFT JOIN operator, e.g.
SELECT u.user_name
, IFNULL(i.cnt,0) AS in_degree
, IFNULL(o.cnt,0) AS out_degree
, IFNULL(i.cnt,0)+IFNULL(o.cnt,0) AS freq
FROM users u
LEFT
JOIN (SELECT rt_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY rt_user_name) i
ON i.rt_user_name = u.user_name
LEFT
JOIN (SELECT source_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY source_user_name) o
ON o.source_user_name = u.user_name
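As a quick check of the pre-aggregated approach, here is a sketch run through Python's built-in sqlite3 module (SQLite also provides IFNULL). The rows are made up, and a user with no tweets is included to show why the IFNULL is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_name TEXT);
    CREATE TABLE tweets (source_user_name TEXT, rt_user_name TEXT);
    INSERT INTO users VALUES ('ann'), ('bob'), ('cat');
    INSERT INTO tweets VALUES ('ann', 'bob'), ('ann', 'bob'), ('bob', 'ann');
    """
)

# Each derived table is computed once for all users, then joined by name.
rows = conn.execute(
    """
    SELECT u.user_name,
           IFNULL(i.cnt, 0) AS in_degree,
           IFNULL(o.cnt, 0) AS out_degree,
           IFNULL(i.cnt, 0) + IFNULL(o.cnt, 0) AS freq
    FROM users u
    LEFT JOIN (SELECT rt_user_name, COUNT(*) AS cnt
               FROM tweets GROUP BY rt_user_name) i
      ON i.rt_user_name = u.user_name
    LEFT JOIN (SELECT source_user_name, COUNT(*) AS cnt
               FROM tweets GROUP BY source_user_name) o
      ON o.source_user_name = u.user_name
    ORDER BY u.user_name
    """
).fetchall()
print(rows)  # [('ann', 1, 2, 3), ('bob', 2, 1, 3), ('cat', 0, 0, 0)]
```

Without the IFNULL calls, the user with no matching tweets would get NULL for every count, and NULL for the sum as well.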
This should work:
insert into userData
SELECT T.user_name,
T.in_degree,
T.out_degree,
(T.in_degree + T.out_degree) as freq
FROM (SELECT user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME) as in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name) as out_degree
FROM users u) T
A quick way to do it would be something like:
insert into userData
select
TMP.user_name,
TMP.in_degree,
TMP.out_degree,
(TMP.in_degree + TMP.out_degree) degreeSum
from(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) TMP
I'm having trouble counting/grouping the results of an inner join.
I have two tables
result_dump: has two columns, email and result (the result value can be either open or bounce)
all_data: has three columns, email, full_name and address
The first goal is to query the result_dump table and count and group the number of times the result is "open" for a specific email.
This query works great:
SELECT `email`, COUNT(*) AS count
FROM `result_dump`
WHERE `date` = "open"
GROUP BY `email`
HAVING COUNT(*) > 3
ORDER BY count DESC
The second goal is to take those results (anyone who opened more than 3 times) and pull in the 'full_name' and 'address', so I will have details on who opened an email 3+ times.
I have this query and it works as far as getting the data together - But I can't figure out how to get the COUNT, HAVING and ORDER to work with the INNER JOIN?
SELECT *
FROM all_data
INNER JOIN result_dump ON
all_data.email = result_dump.email
where `result` = "open"
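SELECT all_data.email, all_data.full_name, all_data.address, COUNT(*) AS count
FROM all_data
INNER JOIN result_dump ON
all_data.email = result_dump.email
WHERE result_dump.`result` = "open"
GROUP BY all_data.email, all_data.full_name, all_data.address
HAVING COUNT(*) > 3
ORDER BY count DESC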
Nothing wrong with this one I think.
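To try the join plus GROUP BY / HAVING combination on a tiny data set, here is a sketch using Python's built-in sqlite3 module. The e-mail addresses, names and counts are invented, and the alias opens is just a name picked for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE all_data (email TEXT, full_name TEXT, address TEXT);
    CREATE TABLE result_dump (email TEXT, result TEXT);
    INSERT INTO all_data VALUES ('a@x.com', 'Ann A', '1 Main St'),
                                ('b@x.com', 'Bob B', '2 Side St');
    """
)
# Ann: 4 opens and 1 bounce. Bob: 2 opens.
events = [("a@x.com", "open")] * 4 + [("a@x.com", "bounce")] + [("b@x.com", "open")] * 2
conn.executemany("INSERT INTO result_dump VALUES (?, ?)", events)

# Filter to opens, count per person, keep only those with more than 3.
frequent = conn.execute(
    """
    SELECT a.email, a.full_name, a.address, COUNT(*) AS opens
    FROM all_data a
    INNER JOIN result_dump r ON a.email = r.email
    WHERE r.result = 'open'
    GROUP BY a.email, a.full_name, a.address
    HAVING COUNT(*) > 3
    ORDER BY opens DESC
    """
).fetchall()
print(frequent)  # [('a@x.com', 'Ann A', '1 Main St', 4)]
```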
Try with following query:
SELECT * FROM all_data AS a
INNER JOIN
(SELECT * FROM result_dump where email IN
(SELECT `email`
FROM `result_dump`
WHERE `date` = "open"
GROUP BY `email`
HAVING count(email) >3
ORDER BY count(email) DESC)) AS b
ON a.email = b.email
WHERE b.`result` = "open"
This works fine. Try this:
SELECT title.title
,count(*)
,title.production_year
,title.id as movie_id
,title.flag as language
,movie_info.info
FROM title INNER JOIN movie_info ON title.id=movie_info.movie_id;