Use NEWID() without losing distinct? - sql-server-2008

I am trying to create a new data extract from a (badly designed) SQL database. The customer requires that I add a distinct identifier, which I am attempting to do using the NEWID() function. Unfortunately this leads to multiple duplicate records being returned.
After a bit of research I have found that the NEWID() function does indeed 'undo' the use of the distinct keyword, but I cannot work out why or how to overcome this.
An example of the query I am trying to write is as follows:
select distinct
NEWID() as UUID
,Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)
If I miss out the NEWID() line in the select block, I get 569 records returned, which is correct, but if I include that line then I get in excess of 30,000 which are all duplicates of the original 569 but with different IDs. Can anyone suggest a way around this problem?
Thanks in advance

Using a subquery would be the easiest way to do it.
SELECT NEWID() as UUID
, * -- this is everything from below
FROM (
select distinct
Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)) as mySub

select NEWID() as UUID
,ISRN
,Internal_Patient_No
,Date_of_Birth
,histo_report
,Investigation_Result_Date
from (
select distinct
Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)) t

You can use a subquery to get around the issue, something like:
SELECT NEWID() as UUID
,*
FROM (
select distinct
Histo_Results_File.ISRN
,Histo_Results_File.Internal_Patient_No
,Histo_Results_File.Date_of_Birth
,Histo_Result_freetext.histo_report
,Histo_Report.Date_Report_Updated as [Investigation_Result_Date]
from apex.Histo_Results_File
inner join apex.Histo_Report
on (Histo_Report.Histo_Results_File = Histo_Results_File.ID)
) t
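
To see why the subquery helps, here is a minimal sketch using SQLite from Python as a stand-in for SQL Server. The table `results`, its columns, and the sample rows are hypothetical, and `hex(randomblob(16))` plays the role of NEWID(). DISTINCT compares entire rows, so a per-row random value makes every row unique; de-duplicating in a derived table first and generating the identifier outside avoids that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (isrn TEXT, patient_no INTEGER)")
# Hypothetical data: three distinct rows, each stored twice.
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [("A1", 1), ("A1", 1), ("B2", 2), ("B2", 2), ("C3", 3), ("C3", 3)],
)

# The random column is part of each row, so DISTINCT removes nothing.
with_id_inline = conn.execute(
    "SELECT DISTINCT hex(randomblob(16)) AS uuid, isrn, patient_no FROM results"
).fetchall()

# De-duplicate first, then attach an identifier to each surviving row.
with_id_outer = conn.execute(
    """
    SELECT hex(randomblob(16)) AS uuid, t.*
    FROM (SELECT DISTINCT isrn, patient_no FROM results) AS t
    """
).fetchall()

print(len(with_id_inline))  # 6
print(len(with_id_outer))   # 3
```

The same shape applies to the original query: keep the joins and DISTINCT inside the derived table and call NEWID() only in the outer select.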

Related

What is the best solution for adding INDEX to speed up the query?

I have a query that takes 50 minutes to run on a MySQL database, which is not acceptable.
I want this process to run in under 15 minutes.
insert into appianData.IC_DeletedRecords
(sourcetableid,
concatkey,
sourcetablecode)
select mstid,
concatkey,
'icmstlocationheader' as sourcetablecode
from appianData.IC_MST_LocationHeader
where concatkey not in(select concatkey
from appianData.IC_PURGE_LocationHeader)
The "sourcetableid" and "mstid" are unique.
So what is the best way to add an index or otherwise optimize this?
Thank you
I would write the select as:
select mstid, concatkey, 'icmstlocationheader' as sourcetablecode
from appianData.IC_MST_LocationHeader lh
where not exists (select 1
from appianData.IC_PURGE_LocationHeader lhp
where lhp.concatkey = lh.concatkey
);
Then you want an index on IC_PURGE_LocationHeader(concatkey).
Since it is a NOT IN condition you should be able to use a "LEFT JOIN ... WHERE rightTable has no match" without concern for multiple matches inflating the results.
INSERT INTO appianData.IC_DeletedRecords (sourcetableid, concatkey, sourcetablecode)
SELECT m.mstid, m.concatkey, 'icmstlocationheader' as sourcetablecode
FROM appianData.IC_MST_LocationHeader AS m
LEFT JOIN appianData.IC_PURGE_LocationHeader AS p
ON m.concatkey = p.concatkey
WHERE p.concatkey IS NULL
;
With this version of the query, or the one you presented in the question, indexes on concatkey in both source tables should help significantly.
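
As a rough illustration, the sketch below runs the three formulations (NOT IN, NOT EXISTS, LEFT JOIN ... IS NULL) against a tiny SQLite database from Python. The shortened table names `mst` and `purge`, the index names, and the sample rows are hypothetical, and `concatkey` is assumed to be NOT NULL, which is the condition under which all three forms return the same rows. It says nothing about MySQL timings; it only shows the queries and indexes side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mst (mstid INTEGER PRIMARY KEY, concatkey TEXT NOT NULL);
    CREATE TABLE purge (concatkey TEXT NOT NULL);
    -- Indexes on the column used to match rows between the two tables.
    CREATE INDEX idx_mst_concatkey ON mst (concatkey);
    CREATE INDEX idx_purge_concatkey ON purge (concatkey);
    INSERT INTO mst VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO purge VALUES ('b'), ('d'), ('d');
    """
)

not_in = conn.execute(
    """
    SELECT mstid FROM mst
    WHERE concatkey NOT IN (SELECT concatkey FROM purge)
    ORDER BY mstid
    """
).fetchall()

not_exists = conn.execute(
    """
    SELECT m.mstid FROM mst AS m
    WHERE NOT EXISTS (SELECT 1 FROM purge AS p WHERE p.concatkey = m.concatkey)
    ORDER BY m.mstid
    """
).fetchall()

left_join = conn.execute(
    """
    SELECT m.mstid FROM mst AS m
    LEFT JOIN purge AS p ON m.concatkey = p.concatkey
    WHERE p.concatkey IS NULL
    ORDER BY m.mstid
    """
).fetchall()

print(not_in)      # [(1,), (3,)]
print(not_exists)  # [(1,), (3,)]
print(left_join)   # [(1,), (3,)]
```

The duplicate 'd' in `purge` does not inflate the LEFT JOIN result, because only unmatched rows survive the IS NULL filter.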

mysql subquery - results do not fulfill both

I have the following SQL query for MySQL:
SELECT SQL_CALC_FOUND_ROWS objects.objects_no
FROM objects
LEFT JOIN finds ON (objects.objects_no = finds.objects_no)
LEFT JOIN ceramics ON (objects.objects_no = ceramics.objects_no)
WHERE 1=1
and (objects.objects_no) in (select DISTINCT objects_no from objects_materials where thesaurus_term_id in (18658))
and (objects.objects_no) in (select DISTINCT objects_no from objects_objects where thesaurus_term_id in (24193))
GROUP BY objects.objects_no
ORDER BY objects.objects_no
Instead of getting results that match both subqueries, I also get results that match one or the other. Does anyone have an idea why that is?
Thanks, Sandro
Try parenthesizing the conditions.
WHERE (
(1=1)
and ((objects.objects_no) in (select DISTINCT objects_no from objects_materials where thesaurus_term_id in (18658)))
and ((objects.objects_no) in (select DISTINCT objects_no from objects_objects where thesaurus_term_id in (24193)))
)
Thanks for all your help. It actually does work just fine. There was a problem within the data inside the thesaurus.
Sorry!!!
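
For anyone checking the same thing, a small sketch with SQLite from Python (hypothetical sample rows, table and column names taken from the question) suggests that two IN subqueries combined with AND do return only rows matching both, with or without extra parentheses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE objects (objects_no INTEGER PRIMARY KEY);
    CREATE TABLE objects_materials (objects_no INTEGER, thesaurus_term_id INTEGER);
    CREATE TABLE objects_objects (objects_no INTEGER, thesaurus_term_id INTEGER);
    INSERT INTO objects VALUES (1), (2), (3);
    -- Object 1 matches only the material, 3 only the object type, 2 both.
    INSERT INTO objects_materials VALUES (1, 18658), (2, 18658);
    INSERT INTO objects_objects VALUES (2, 24193), (3, 24193);
    """
)

rows = conn.execute(
    """
    SELECT objects_no FROM objects
    WHERE objects_no IN (SELECT objects_no FROM objects_materials
                         WHERE thesaurus_term_id = 18658)
      AND objects_no IN (SELECT objects_no FROM objects_objects
                         WHERE thesaurus_term_id = 24193)
    ORDER BY objects_no
    """
).fetchall()

print(rows)  # [(2,)]
```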

How to get two columns of two different tables where id = id and id= 9?

My first query:
SELECT `oc_banner_image_description`.`title`
FROM `oc_banner_image_description`
WHERE `banner_id`=9
My second query:
SELECT `oc_banner_image`.`image` FROM `oc_banner_image` WHERE `banner_id`=9
How can I combine these two queries into a single query using SQL joins?
Using standard join syntax, it would look like this:
SELECT `oc_banner_image_description`.`title`, `oc_banner_image`.`image`
FROM `oc_banner_image_description`
JOIN `oc_banner_image` ON `oc_banner_image_description`.`banner_id` = `oc_banner_image`.`banner_id`
WHERE `oc_banner_image`.`banner_id`=9
Try this (you may need to adjust the backtick quoting to suit your setup):
SELECT `i`.`image`, `d`.`title`
FROM `oc_banner_image` AS `i`, `oc_banner_image_description` AS `d`
WHERE `d`.`banner_id` = `i`.`banner_id`
AND `i`.`banner_id` = 9
If this is not working, try this on both tables:
select banner_id, count(banner_id)
from oc_banner_image
group by banner_id
order by count(banner_id) desc;
This will tell you if you have multiple banner_id's in the oc_banner_image table.
Try this
SELECT bannerDesc.title , bannerImage.image
FROM oc_banner_image_description bannerDesc join oc_banner_image bannerImage
on bannerDesc.banner_id = bannerImage.banner_id
WHERE bannerImage.banner_id=9
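
The join and the duplicate check from the answers above can be tried out with a minimal sketch like the following, run against SQLite from Python. Only the columns named in the question are created, and the sample rows (file names, titles, banner 7) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE oc_banner_image (banner_id INTEGER, image TEXT);
    CREATE TABLE oc_banner_image_description (banner_id INTEGER, title TEXT);
    INSERT INTO oc_banner_image VALUES (7, 'a.png'), (7, 'b.png'), (9, 'summer.png');
    INSERT INTO oc_banner_image_description VALUES (7, 'Winter'), (9, 'Summer sale');
    """
)

# One query instead of two: join on banner_id, then filter to banner 9.
joined = conn.execute(
    """
    SELECT d.title, i.image
    FROM oc_banner_image_description AS d
    JOIN oc_banner_image AS i ON d.banner_id = i.banner_id
    WHERE i.banner_id = 9
    """
).fetchall()

# Diagnostic: how many image rows exist per banner_id?
counts = conn.execute(
    """
    SELECT banner_id, COUNT(banner_id)
    FROM oc_banner_image
    GROUP BY banner_id
    ORDER BY COUNT(banner_id) DESC
    """
).fetchall()

print(joined)  # [('Summer sale', 'summer.png')]
print(counts)  # [(7, 2), (9, 1)]
```

If a banner_id appears several times in both tables, joining on banner_id alone pairs every title with every image for that banner, which is what the count query helps to spot.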

Why is this MySQL statement pulling data from both tables on this JOIN?

select * from user_levels
join collectors_users on user_levels.id = collectors_users.user_level
where collectors_users.username = 'testuser'
I want it to pull everything from user_levels and nothing from collectors_users. But it's pulling from both. How do I correct the statement?
Instead of select *, specify what you actually want and use select user_levels.*, or even better, skip the * and write out the columns you want (and consider using aliases to keep it short and tidy): select ul.col1, ul.col2 ... from user_levels ul join ...
It is getting all the data as the '*' means 'all' columns. You can limit the columns for just one table by specifying the table:
select user_levels.*
from user_levels
join collectors_users on user_levels.id = collectors_users.user_level
where collectors_users.username = 'testuser'
Pro tip: Don't use SELECT * in running software. Instead, be as specific as you can be about the columns you want in your result set.
SELECT user_levels.*
should help a bit.
I might suggest that you use in or exists, because this is more consistent with the intention of the query:
select ul.*
from user_levels ul
where ul.id in (select cu.user_level
from collectors_users cu
where cu.username = 'testuser'
);
In addition, this version will not produce duplicate rows if collectors_users has multiple matching rows for a single row in user_levels.
Also note the use of table aliases: these make the query easier to write and to read.
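
The difference in duplicate handling can be seen in a short sketch using SQLite from Python. The `name` column and the sample rows are hypothetical; the point is only that the JOIN repeats a user_levels row once per matching collectors_users row, while the IN form returns it once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_levels (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE collectors_users (username TEXT, user_level INTEGER);
    INSERT INTO user_levels VALUES (1, 'admin'), (2, 'member');
    -- Hypothetical data: 'testuser' appears twice with the same level.
    INSERT INTO collectors_users VALUES ('testuser', 1), ('testuser', 1), ('other', 2);
    """
)

via_join = conn.execute(
    """
    SELECT ul.*
    FROM user_levels AS ul
    JOIN collectors_users AS cu ON ul.id = cu.user_level
    WHERE cu.username = 'testuser'
    """
).fetchall()

via_in = conn.execute(
    """
    SELECT ul.*
    FROM user_levels AS ul
    WHERE ul.id IN (SELECT cu.user_level
                    FROM collectors_users AS cu
                    WHERE cu.username = 'testuser')
    """
).fetchall()

print(via_join)  # [(1, 'admin'), (1, 'admin')]
print(via_in)    # [(1, 'admin')]
```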

Using an "AS" clause in a sub-query that contains a "WHERE" clause

I want to store a result under an alias created with the "AS" clause in MS Access,
and then use that alias in a sub-query with a WHERE clause.
I tried this:
SELECT en_date AS date_en, (select sum(amount)
from main where
CrDb='Cr'
and
en_date=date_en) AS CR_AMT
FROM main
GROUP BY en_date;
I'm fairly certain you can't use an alias (the AS destination) in the same SELECT you defined it in.
I can't tell quite what you're trying to do, but it kind of looks like you want to be joining the table to itself.
SELECT
en_date,
SUM(amount)
FROM
main a
INNER JOIN
(
SELECT
en_date AS date_en,
CrDb
FROM
main
WHERE
CrDb='Cr'
)b
ON
a.en_date = b.date_en
AND a.CrDb = b.CrDb
GROUP BY
en_date
select m.en_date date_en, sum(m.amount)
from main m
where CrDb = 'Cr'
group by m.en_date
In other words, I don't think you even need a sub-query to get the results you are looking for.
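
A quick way to check the no-subquery version is a sketch like this one, using SQLite from Python rather than MS Access. The sample dates and amounts are hypothetical; the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE main (en_date TEXT, CrDb TEXT, amount INTEGER);
    INSERT INTO main VALUES
        ('2024-01-01', 'Cr', 100),
        ('2024-01-01', 'Cr', 50),
        ('2024-01-01', 'Db', 30),
        ('2024-01-02', 'Cr', 20);
    """
)

# Filter to credit rows first, then total the amounts per date.
totals = conn.execute(
    """
    SELECT m.en_date AS date_en, SUM(m.amount) AS CR_AMT
    FROM main AS m
    WHERE m.CrDb = 'Cr'
    GROUP BY m.en_date
    ORDER BY m.en_date
    """
).fetchall()

print(totals)  # [('2024-01-01', 150), ('2024-01-02', 20)]
```

One thing to keep in mind: because the WHERE clause filters before grouping, a date that has only 'Db' rows does not appear in the result at all.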