Improving query speed: simple SELECT from SELECT in huge table - mysql

I have a table with 3 columns: age, name, nickname.
I would like to get only the names (with their age) where the name+age pair does not appear anywhere as a nickname+age pair.
For example, if the table DETAILS contains 2 rows:
age: 5 , name: suzi, nickname: suzi
age:2 , name : gil, nickname: g
query will return : age:2 , name : gil
SELECT d1.AGE, d1.NAME
FROM DETAILS d1
WHERE d1.NAME NOT IN (SELECT d2.NICKNAME FROM DETAILS d2 WHERE d2.AGE = d1.AGE)
This query only finishes in reasonable time on small data sets.
Any idea how to improve it?

The critical point in SQL query performance is using an index. You need an index on the columns you filter or join on, and the query has to be written so that the index can actually be used (for example via a join).
E.g. query:
SELECT D1.AGE, D1.NAME
FROM DETAILS D1 LEFT JOIN DETAILS D2 ON D1.AGE = D2.AGE AND D1.NAME = D2.NICKNAME
WHERE D2.AGE IS NULL
Note that you have to create an index covering AGE and NICKNAME (for example a composite index on (AGE, NICKNAME)) beforehand to fully benefit from this query.
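To make the "not matched by any nickname of the same age" idea concrete, here is a minimal sketch that runs the anti-join as NOT EXISTS on the two sample rows from the question. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the index name is made up. One thing worth knowing: unlike NOT IN, NOT EXISTS does not silently return nothing when a NICKNAME in the subquery is NULL.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (age INTEGER, name TEXT, nickname TEXT);
    INSERT INTO details VALUES (5, 'suzi', 'suzi'), (2, 'gil', 'g');
    -- Hypothetical index name; it covers the lookup by (age, nickname).
    CREATE INDEX idx_details_age_nickname ON details (age, nickname);
""")

# Keep a row only if no row with the same age uses its name as a nickname.
anti_join_sql = """
    SELECT d1.age, d1.name
    FROM details d1
    WHERE NOT EXISTS (
        SELECT 1
        FROM details d2
        WHERE d2.age = d1.age
          AND d2.nickname = d1.name
    )
"""
rows = conn.execute(anti_join_sql).fetchall()
print(rows)  # [(2, 'gil')]
```

Whether this is faster than the original on your data depends on the optimizer and the index, so check the plan with EXPLAIN on the real table.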

Use LEFT JOIN / LEFT OUTER JOIN instead of WHERE ... NOT IN ...
The logical order in which the clauses of a SQL query are processed:
FROM
ON
OUTER
WHERE
GROUP BY
CUBE | ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP (LIMIT in MySQL)
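The practical consequence of this order is that a condition placed in ON is applied before the unmatched outer rows are added back, while a condition in WHERE is applied after. A small sketch of the difference, using Python's sqlite3 as a stand-in for MySQL and the sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (age INTEGER, name TEXT, nickname TEXT);
    INSERT INTO details VALUES (5, 'suzi', 'suzi'), (2, 'gil', 'g');
""")

# Match condition in ON: unmatched d1 rows survive with NULLs for d2,
# so WHERE can then keep exactly the rows that found no partner.
on_rows = conn.execute("""
    SELECT d1.age, d1.name
    FROM details d1
    LEFT JOIN details d2
           ON d2.age = d1.age AND d2.nickname = d1.name
    WHERE d2.age IS NULL
""").fetchall()

# Same condition moved to WHERE: it is evaluated after the outer rows
# were added and throws them away again, like an inner join would.
where_rows = conn.execute("""
    SELECT d1.age, d1.name
    FROM details d1
    LEFT JOIN details d2 ON d2.age = d1.age
    WHERE d2.nickname = d1.name
""").fetchall()

print(on_rows)     # [(2, 'gil')]
print(where_rows)  # [(5, 'suzi')]
```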

You can easily get the result by this:
SELECT d1.AGE,
CASE WHEN d1.NAME IS NULL
THEN d1.NICKNAME ELSE d1.NAME END AS NewName
FROM DETAILS d1
INNER JOIN DETAILS d2
ON d1.AGE = d2.AGE
WHERE d2.NAME <> d1.NICKNAME

Related

Query involving join - combine data from 2 of our tables

We need to combine data from 2 of our tables - active_deals and deal_display_controls. The (relevant) columns in each of them are:
`active_deals` : `deal_id`, `region_code`
`deal_display_controls` : `deal_id`, `display_control_id`
The query needs to fetch the count of all active deals that have a given display_control_id, grouped by region_code. Along with this, it also needs the count of all active deals per region code (without the display_control_id constraint).
So, if the active deals tables looks like:
deal_id region_code
d1 US
d2 CA
d3 US
And the deal_display_controls looks like:
`deal_id` `display_control_id`
d1 dc1
d2 dc1
d3 dc2
d4 dc1
Then for a given display control id = “dc1”
The query should output:
`region_code` `count_of_active_deals_with_given_display_control_for_region_code` `count_of_active_deals_for_region_code`
US 1 2
CA 1 1
I could do this by splitting the above question into 2 parts and writing individual queries for each part:
Part 1: To get the active deals for a given display_control_id and group by region code
Query for the same:
select count(*), region_code from active_deals join deal_display_controls on active_deals.bin_deal_id=deal_display_controls.deal_id where deal_display_controls.display_control_id=0xc04e20724f5f49c9a285cb3c98d777b4 group by active_deals.region_code;
O/p:
`region_code` `count_of_active_deals_with_given_display_control_for_region_code`
US 1
CA 1
Part2: To get the active deals and group by region code
Query for the same:
select count(*), region_code from active_deals join deal_display_controls on active_deals.bin_deal_id=deal_display_controls.deal_id group by active_deals.region_code;
O/p:
`region_code` `count_of_active_deals_for_region_code`
US 2
CA 1
I need a way to combine these 2 queries into a single query. Is it possible to do this?
While counting, you can add an additional condition on the display control id:
select d.region_code
, count(distinct d.deal_id) count_of_active_deals_for_region_code
, SUM(c.display_control_id = 'dc1') count_of_active_deals_with_given_display_control_for_region_code
from active_deals d
left join deal_display_controls c
on d.deal_id = c.deal_id
group by d.region_code
Fiddle: http://sqlfiddle.com/#!9/904670/2
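In case the fiddle link goes stale, here is a small self-contained sketch of the same conditional aggregation on the sample data from the question. It runs on Python's sqlite3 rather than MySQL, and counts_by_region is just a helper name I made up. I used COUNT(DISTINCT CASE ...) instead of SUM(condition) so that a region with no matching deal shows 0 and a deal is not counted twice if it has several display control rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active_deals (deal_id TEXT, region_code TEXT);
    CREATE TABLE deal_display_controls (deal_id TEXT, display_control_id TEXT);
    INSERT INTO active_deals VALUES ('d1', 'US'), ('d2', 'CA'), ('d3', 'US');
    INSERT INTO deal_display_controls VALUES
        ('d1', 'dc1'), ('d2', 'dc1'), ('d3', 'dc2'), ('d4', 'dc1');
""")

def counts_by_region(conn, control_id):
    """Return (region, deals with control_id, all active deals) per region."""
    sql = """
        SELECT d.region_code,
               COUNT(DISTINCT CASE WHEN c.display_control_id = ?
                                   THEN d.deal_id END),
               COUNT(DISTINCT d.deal_id)
        FROM active_deals d
        LEFT JOIN deal_display_controls c ON c.deal_id = d.deal_id
        GROUP BY d.region_code
        ORDER BY d.region_code
    """
    return conn.execute(sql, (control_id,)).fetchall()

result = counts_by_region(conn, "dc1")
print(result)  # [('CA', 1, 1), ('US', 1, 2)]
```

The LEFT JOIN matters: with an inner join, an active deal that has no row in deal_display_controls would drop out of the second count as well.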
I think you want a join with conditional aggregation:
select ad.region_code,
sum(ddc.display_control_id = 'dc1'),
count(*)
from active_deals ad join
deal_display_controls ddc
on ad.deal_id = ddc.deal_id
group by ad.region_code

How to obtain data from sub query that has self join?

My table:
friend(uid_1 int, uid_2 int)
This is my query :
SELECT a.uid_1
, a.uid_2 as a2
, b.uid_1 as b1
, b.uid_2
from friend a
join friend b
on a.uid_1 = b.uid_2;
I want to obtain only a2 and b1 from this query for other purposes.
So I tried this query:
Select a2,b1
from (SELECT a.uid_1,a.uid_2 as a2,b.uid_1 as b1,b.uid_2
from friend a join
friend b
on a.uid_1=b.uid_2
)
but it does not work. How do I select only certain columns from the result of the inner query?
First, you are just missing an alias name for your subquery, as below:
Select a2,b1
from (
SELECT a.uid_1,a.uid_2 as a2,b.uid_1 as b1,b.uid_2
from friend a join
friend b
on a.uid_1=b.uid_2
) A -- added A as an alias
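Second, I am not sure what you are trying to do with the JOIN. Note that it pairs every row a with every row b where a.uid_1 = b.uid_2, so it is not equivalent to the single-table filter below, which only returns rows whose own two columns are equal: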
SELECT *
FROM friend
WHERE uid_1 = uid_2
You need a table alias for the subquery, e.g. SELECT ... FROM (subquery) T.
Then you can refer to the subquery's columns with a fully qualified name:
Select T.a2, T.b1
from (SELECT a.uid_1,a.uid_2 as a2, b.uid_1 as b1, b.uid_2 b2
from friend a join
friend b
on a.uid_1=b.uid_2
) T
You were just missing the alias:
Select a2,b1
from (SELECT a.uid_1,a.uid_2 as a2,b.uid_1 as b1,b.uid_2
from friend a join
friend b
on a.uid_1=b.uid_2
) as temp -- here it is
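For completeness, a runnable sketch of the aliased derived table. It uses Python's sqlite3 with a few made-up rows, since the question gives no sample data. Be aware that SQLite happens to accept a subquery in FROM without an alias, whereas MySQL rejects it with "Every derived table must have its own alias", so the sketch only shows the working form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friend (uid_1 INTEGER, uid_2 INTEGER);
    -- Made-up sample rows for illustration only.
    INSERT INTO friend VALUES (1, 2), (2, 3), (3, 1);
""")

# The derived table gets the alias t, and the outer query picks
# just the two columns it needs from it.
pairs = conn.execute("""
    SELECT t.a2, t.b1
    FROM (
        SELECT a.uid_1, a.uid_2 AS a2, b.uid_1 AS b1, b.uid_2
        FROM friend a
        JOIN friend b ON a.uid_1 = b.uid_2
    ) AS t
    ORDER BY t.a2, t.b1
""").fetchall()

print(pairs)  # [(1, 2), (2, 3), (3, 1)]
```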

Migrating data from one database to another using LEFT JOIN

Let me start by saying I am relatively new to MySQL. I am trying to migrate data from one database to another. Let us call one database DB1 and the other DB2.
DB2 has the following tables:
Patient: id, person_id and start_regimen_id
Regimen: id, code
Visit: id, patient_id, regimen_id,next_appointment_date
DB1 has the following tables:
Patient: id, medical_record_number, current_regimen, start_regimen, nextappointment
Now:
regimen_id data should be inserted to current_regimen
start_regimen_id data should be inserted to start_regimen
next_appointment_date should be inserted to nextappointment
This is what I have now:
SELECT
p.person_id AS medical_record_number,
r.code AS start_regimen,
??? AS current_regimen,
DATE_FORMAT(v.next_appointment_date, '%Y-%m-%d') AS nextappointment
FROM patient p
LEFT JOIN regimen r ON r.id = p.start_regimen_id
LEFT JOIN (
SELECT patient_id, MAX(next_appointment_date) as next_appointment_date
FROM visit
WHERE next_appointment_date IS NOT NULL
GROUP BY patient_id
) v ON v.patient_id = p.id
What remains is to migrate regimen_id (visit) in DB2 to current_regimen (patient) in DB1. I don't know how to use two LEFT JOINs to pull data from two tables into one.
Any assistance will be greatly appreciated because I am really stuck.
Seems like we would want to include regimen_id in the GROUP BY from the visit table, and then match that to id from the regimen table. Given the outer join, it appears that patient may not have any regimen associated, so I would include matching of NULL values of regimen_id.
LEFT JOIN ( SELECT vv.patient_id
, vv.regimen_id
, MAX(vv.next_appointment_date) AS next_appointment_date
FROM visit vv
WHERE vv.next_appointment_date IS NOT NULL
GROUP BY vv.patient_id
, vv.regimen_id
) v
ON v.patient_id = p.id
AND v.regimen_id <=> r.id
But that's just a guess. Without a specification (preferably illustrated by example data and expected output) we're just guessing.
Note:
foo <=> bar
is a NULL-safe comparison, a shorthand equivalent to
( foo = bar OR ( foo IS NULL AND bar IS NULL ) )
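A quick way to convince yourself of the difference between plain = and the NULL-safe form when used as a filter: the sketch below evaluates both predicates in a WHERE clause. It runs on Python's sqlite3, which has no <=> operator, so it uses the expanded form from the note; the helper names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def accepted(predicate, foo, bar):
    """Return True if a row (foo, bar) passes the given WHERE predicate."""
    sql = (
        "SELECT COUNT(*) FROM (SELECT ? AS foo, ? AS bar) AS t "
        "WHERE " + predicate
    )
    return conn.execute(sql, (foo, bar)).fetchone()[0] == 1

PLAIN = "t.foo = t.bar"
NULL_SAFE = "(t.foo = t.bar OR (t.foo IS NULL AND t.bar IS NULL))"

# Plain equality never accepts a NULL, not even NULL against NULL.
plain_null_null = accepted(PLAIN, None, None)
# The NULL-safe form treats two NULLs as a match.
safe_null_null = accepted(NULL_SAFE, None, None)
# A NULL against a real value is rejected by both forms.
safe_one_null = accepted(NULL_SAFE, 1, None)

print(plain_null_null, safe_null_null, safe_one_null)  # False True False
```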

Join 2 tables and display by id,priority & number

Example tables and desired result:
The result table shown below is the output I actually want.
I tried the following query with PIVOT:
with pivot_data AS
(
select client_id
,ph_type
,Ph_number
from client_table
inner join phone_table
on client_table.phone_id = phone_table.ph_id
)
select *
from pivot_data
pivot (sum(ph_number)
for ph_type in ('c','w','h')
);
Result I got:
Any help would be appreciated.
Answers for SQL Server would be great, but Oracle and MySQL are also welcome if they can point me in the right direction. :)
Thanks in advance.
Oracle Query:
SELECT *
FROM (
SELECT client_id, priority, phone_number, phone_type
FROM client_table c
LEFT OUTER JOIN
phone_table p
ON ( c.phone_id = p.phone_id )
)
PIVOT ( MAX( phone_type ) AS phonetype, MAX( phone_number ) AS phonenumber
FOR priority IN ( 1 AS Prio1, 2 AS Prio2, 3 AS Prio3 ) );
Output:
CLIENT_ID PRIO1_PHONETYPE PRIO1_PHONENUMBER PRIO2_PHONETYPE PRIO2_PHONENUMBER PRIO3_PHONETYPE PRIO3_PHONENUMBER
---------- --------------- ----------------- --------------- ----------------- --------------- -----------------
1 C 9999999999 H 5555555555 W 7777777777
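If your database has no PIVOT (MySQL does not), the same shape can be produced with conditional aggregation, i.e. one MAX(CASE ...) per output column. Below is a minimal sketch on Python's sqlite3. The column names are taken from the query above, while the phone_id values are invented because the original sample tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_table (client_id INTEGER, phone_id INTEGER, priority INTEGER);
    CREATE TABLE phone_table (phone_id INTEGER, phone_type TEXT, phone_number TEXT);
    -- phone_id values are invented for this sketch.
    INSERT INTO client_table VALUES (1, 10, 1), (1, 11, 2), (1, 12, 3);
    INSERT INTO phone_table VALUES
        (10, 'C', '9999999999'), (11, 'H', '5555555555'), (12, 'W', '7777777777');
""")

# One MAX(CASE ...) per (priority, attribute) pair; rows of other
# priorities contribute NULL, which MAX ignores.
pivot_sql = """
    SELECT c.client_id,
           MAX(CASE WHEN c.priority = 1 THEN p.phone_type   END) AS prio1_phonetype,
           MAX(CASE WHEN c.priority = 1 THEN p.phone_number END) AS prio1_phonenumber,
           MAX(CASE WHEN c.priority = 2 THEN p.phone_type   END) AS prio2_phonetype,
           MAX(CASE WHEN c.priority = 2 THEN p.phone_number END) AS prio2_phonenumber,
           MAX(CASE WHEN c.priority = 3 THEN p.phone_type   END) AS prio3_phonetype,
           MAX(CASE WHEN c.priority = 3 THEN p.phone_number END) AS prio3_phonenumber
    FROM client_table c
    LEFT JOIN phone_table p ON p.phone_id = c.phone_id
    GROUP BY c.client_id
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(pivoted)
# [(1, 'C', '9999999999', 'H', '5555555555', 'W', '7777777777')]
```

As with PIVOT, the list of priorities is hard-coded, so a fourth priority needs two more columns in the query.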
You really need to do some reading on set based thinking and how what you are asking for will be very detrimental to your maintenance of the SSIS solution moving forwards.
All you need to do is export the data as is. If you absolutely have to have it all in the one CSV file, just join the two tables together and retain a normalised, scalable dataset that won't break if the number of priorities increases:
select c.client_id
,c.phone_id
,c.priority
,p.phone_type
,p.phone_number
from #Client c
join #Phone p
on c.phone_id = p.phone_id

SQL query for matching multiple values in the same column

I have a table in MySQL as follows.
Id Designation Years Employee
1 Soft.Egr 2000-2005 A
2 Soft.Egr 2000-2005 B
3 Soft.Egr 2000-2005 C
4 Sr.Soft.Egr 2005-2010 A
5 Sr.Soft.Egr 2005-2010 B
6 Pro.Mgr 2010-2012 A
I need to get the Employees who worked as Soft.Egr and Sr.Soft.Egr and Pro.Mgr. It is not possible to use IN or multiple ANDs in the query. How can I do this?
One way:
select Employee
from job_history
where Designation in ('Soft.Egr','Sr.Soft.Egr','Pro.Mgr')
group by Employee
having count(distinct Designation) = 3
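Here is that query run against the sample rows from the question, as a small sketch on Python's sqlite3 (standing in for MySQL). The helper name employees_with_all is made up. The point of the HAVING clause is that the number of distinct matched designations must equal the number of designations you asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job_history (id INTEGER, designation TEXT, years TEXT, employee TEXT);
    INSERT INTO job_history VALUES
        (1, 'Soft.Egr',    '2000-2005', 'A'),
        (2, 'Soft.Egr',    '2000-2005', 'B'),
        (3, 'Soft.Egr',    '2000-2005', 'C'),
        (4, 'Sr.Soft.Egr', '2005-2010', 'A'),
        (5, 'Sr.Soft.Egr', '2005-2010', 'B'),
        (6, 'Pro.Mgr',     '2010-2012', 'A');
""")

def employees_with_all(conn, designations):
    """Return employees who held every one of the given designations."""
    wanted = list(dict.fromkeys(designations))  # drop duplicates, keep order
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT employee
        FROM job_history
        WHERE designation IN ({placeholders})
        GROUP BY employee
        HAVING COUNT(DISTINCT designation) = ?
        ORDER BY employee
    """
    rows = conn.execute(sql, (*wanted, len(wanted))).fetchall()
    return [r[0] for r in rows]

all_three = employees_with_all(conn, ["Soft.Egr", "Sr.Soft.Egr", "Pro.Mgr"])
first_two = employees_with_all(conn, ["Soft.Egr", "Sr.Soft.Egr"])
print(all_three)  # ['A']
print(first_two)  # ['A', 'B']
```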
What you might actually be looking for is relational division, even if your exercise requirements forbid using AND (for whatever reason?). This is tricky, but possible to express correctly in SQL.
Relational division in prose means: find those employees who have a record in the Employees table for every existing designation. Or in SQL:
SELECT DISTINCT E1.Employee FROM Employees E1
WHERE NOT EXISTS (
SELECT 1 FROM Employees E2
WHERE NOT EXISTS (
SELECT 1 FROM Employees E3
WHERE E3.Employee = E1.Employee
AND E3.Designation = E2.Designation
)
)
To see the above query in action, consider this SQLFiddle
A good resource explaining relational division can be found here:
http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division
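Since the fiddle link did not survive, here is a sketch of the double NOT EXISTS query on the sample rows from the question, run through Python's sqlite3. The table name Employees follows the query above. Read it as: keep an employee if there is no designation in the table that this employee is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Id INTEGER, Designation TEXT, Years TEXT, Employee TEXT);
    INSERT INTO Employees VALUES
        (1, 'Soft.Egr',    '2000-2005', 'A'),
        (2, 'Soft.Egr',    '2000-2005', 'B'),
        (3, 'Soft.Egr',    '2000-2005', 'C'),
        (4, 'Sr.Soft.Egr', '2005-2010', 'A'),
        (5, 'Sr.Soft.Egr', '2005-2010', 'B'),
        (6, 'Pro.Mgr',     '2010-2012', 'A');
""")

# Outer NOT EXISTS: there must be no designation (E2) ...
# Inner NOT EXISTS: ... for which this employee (E1) has no row (E3).
division_sql = """
    SELECT DISTINCT E1.Employee
    FROM Employees E1
    WHERE NOT EXISTS (
        SELECT 1 FROM Employees E2
        WHERE NOT EXISTS (
            SELECT 1 FROM Employees E3
            WHERE E3.Employee = E1.Employee
              AND E3.Designation = E2.Designation
        )
    )
    ORDER BY E1.Employee
"""
held_every_designation = [r[0] for r in conn.execute(division_sql)]
print(held_every_designation)  # ['A']
```

Note that the set of required designations comes from the data itself here, so adding a row with a new designation changes who qualifies.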
If you need to get additional information back about each of the roles (like the dates) then joining back to your original table for each of the additional designations is a possible solution:
SELECT t.Employee, t.Designation, t.Years, t1.Designation, t1.Years, t2.Designation, t2.Years
FROM Employees t
INNER JOIN Employees t1 ON (t1.Employee = t.Employee AND t1.Designation = 'Sr.Soft.Egr')
INNER JOIN Employees t2 ON (t2.Employee = t.Employee AND t2.Designation = 'Soft.Egr')
WHERE t.Designation = 'Pro.Mgr';
Why not the following (for postgresql)?
SELECT employee FROM Employees WHERE Designation ='Sr.Soft.Egr'
INTERSECT
SELECT employee FROM Employees WHERE Designation ='Soft.Egr'
INTERSECT
SELECT employee FROM Employees WHERE Designation ='Pro.Mgr'
Link to SQLfiddle
I know this might not be optimized, but I find it much easier to understand and modify.
Try this query:
SELECT DISTINCT t1.employee,
t1.designation
FROM tempEmployees t1, tempEmployees t2, tempEmployees t3
WHERE t1.employee = t2.employee AND
t2.employee = t3.employee AND
t3.employee = t1.employee AND
t1.designation != t2.designation AND
t2.designation != t3.designation AND
t3.designation != t1.designation