How to optimize query with many joins? - mysql

I have a simple but long query that counts the rows of the result, and it takes about 14 seconds. The count on the main table alone takes less than a second, but after the multiple joins the delay is too high. The query is as follows:
Select Count(Distinct visits.id) As Count_id
From visits
Left Join clients_locations ON visits.client_location_id = clients_locations.id
Left Join clients ON clients_locations.client_id = clients.id
Left Join locations ON clients_locations.location_id = locations.id
Left Join users ON visits.user_id = users.id
Left Join potentialities ON clients_locations.potentiality = potentialities.id
Left Join classes ON clients_locations.class = classes.id
Left Join professions ON clients.profession_id = professions.id
Inner Join specialties ON clients.specialty_id = specialties.id
Left Join districts ON locations.district_id = districts.id
Left Join provinces ON districts.province_id = provinces.id
Left Join locations_types ON locations.location_type_id = locations_types.id
Left Join areas ON clients_locations.area_id = areas.id
Left Join calls ON calls.visit_id = visits.id
The output of explain is
+---+---+---+---+---+---+---+---+---+---+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+---+---+---+---+---+---+---+---+---+---+
| 1 | SIMPLE | specialties | index | PRIMARY | specialty_name | 52 | NULL | 53 | Using index |
| 1 | SIMPLE | clients | ref | PRIMARY,specialty | specialty | 4 | crm_db.specialties.id | 143 | |
| 1 | SIMPLE | clients_locations | ref | PRIMARY,client_id | client_id | 4 | crm_db.clients.id | 1 | |
| 1 | SIMPLE | locations | eq_ref | PRIMARY | PRIMARY | 4 | crm_db.clients_locations.location_id | 1 | |
| 1 | SIMPLE | districts | eq_ref | PRIMARY | PRIMARY | 4 | crm_db.locations.district_id | 1 | Using where |
| 1 | SIMPLE | visits | ref | unique_visit,client_location_id | unique_visit | 4 | crm_db.clients_locations.id | 4 | Using index |
| 1 | SIMPLE | calls | ref | call_unique,visit_id | call_unique | 4 | crm_db.visits.id | 1 | Using index |
+---+---+---+---+---+---+---+---+---+---+
Update 1
The above query is used with a dynamic where statement ($sql = $sql . "Where ". $whereFilter), but I submitted it in its simple form. So please do not consider "just eliminate the joins" as the answer :)
Update 2
Here is example of dynamic filtering
$temp = $this->province_id;
if ($temp != null) {
$whereFilter = $whereFilter . " and provinces.id In ($temp) ";
}
But in the startup case, which is our case, there is no where statement.

Left joins always return a row from the first table, but may return multiple rows if there are multiple matching rows. But because you are counting distinct visit rows, left joining to another table while counting distinct visits is the same as just counting the rows of visits. Thus the only joins that affect the result are inner joins, so you can remove all "completely" left joined tables without affecting the result.
What I mean by "completely" is that some left joined tables are effectively inner joined; the inner join to specialties requires the join to clients to succeed and thus also be an inner join, which in turn requires the join to clients_locations to succeed and thus also be an inner join.
Your query (as posted) can be reduced to:
Select Count(Distinct visits.id) As Count_id
From visits
Join clients_locations ON visits.client_location_id = clients_locations.id
Join clients ON clients_locations.client_id = clients.id
Join specialties ON clients.specialty_id = specialties.id
Removing all those unnecessary joins will greatly improve the runtime of your query, not only because there are fewer joins to make but also because the resulting rowset could be enormous when you consider that its size is the product of the matches in all the tables (not the sum).
For maximum performance, create covering indexes on all id-and-fk columns:
create index visits_id_client_location_id on visits(id, client_location_id);
create index clients_locations_id_client_id on clients_locations(id, client_id);
create index clients_id_specialty_id on clients(id, specialty_id);
so index-only scans can be used where possible. I assume there are indexes on the PK columns.
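To sanity-check the reasoning above, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The schema is cut down to the join columns and the rows are made up for illustration. It compares a shortened version of the original query (left joins plus the one inner join) with the reduced query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specialties (id INTEGER PRIMARY KEY);
CREATE TABLE clients (id INTEGER PRIMARY KEY, specialty_id INTEGER);
CREATE TABLE clients_locations (id INTEGER PRIMARY KEY, client_id INTEGER);
CREATE TABLE visits (id INTEGER PRIMARY KEY, client_location_id INTEGER);
CREATE TABLE calls (id INTEGER PRIMARY KEY, visit_id INTEGER);

-- Made-up rows: client 2 has no specialty, visit 103 has no client location,
-- and visit 100 has two calls (so the left join to calls multiplies rows).
INSERT INTO specialties VALUES (1);
INSERT INTO clients VALUES (1, 1), (2, NULL);
INSERT INTO clients_locations VALUES (10, 1), (20, 2);
INSERT INTO visits VALUES (100, 10), (101, 10), (102, 20), (103, NULL);
INSERT INTO calls VALUES (1, 100), (2, 100), (3, 101);
""")

full_query = """
SELECT COUNT(DISTINCT visits.id)
FROM visits
LEFT JOIN clients_locations ON visits.client_location_id = clients_locations.id
LEFT JOIN clients ON clients_locations.client_id = clients.id
INNER JOIN specialties ON clients.specialty_id = specialties.id
LEFT JOIN calls ON calls.visit_id = visits.id
"""

reduced_query = """
SELECT COUNT(DISTINCT visits.id)
FROM visits
JOIN clients_locations ON visits.client_location_id = clients_locations.id
JOIN clients ON clients_locations.client_id = clients.id
JOIN specialties ON clients.specialty_id = specialties.id
"""

full_count = conn.execute(full_query).fetchone()[0]
reduced_count = conn.execute(reduced_query).fetchone()[0]
all_visits = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
print(full_count, reduced_count, all_visits)
```

Both counts agree, and they are smaller than the plain count of visits because the inner join to specialties drops visits whose client has no specialty.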

You don't seem to have any (or much) intentional filtering. If you want to know the number of visits referred to in calls, I would propose:
select count(distinct c.visit_id)
from calls c;

In order to optimize the whole process, you can dynamically construct the pre-where SQL according to the filters you are going to apply. Like:
// base select and left join
$preSQL = "Select Count(Distinct visits.id) As Count_id From visits ";
$preSQL .= "Left Join clients_locations ON visits.client_location_id = clients_locations.id ";
$whereFilter = "";
// filtering by province_id
$temp = $this->province_id;
if ($temp != null) {
$preSQL .= "Left Join locations ON clients_locations.location_id = locations.id ";
$preSQL .= "Left Join districts ON locations.district_id = districts.id ";
$preSQL .= "Left Join provinces ON districts.province_id = provinces.id ";
$whereFilter = "provinces.id In ($temp) ";
}
// append the where clause only when at least one filter is set
$sql = $whereFilter != "" ? $preSQL . "Where " . $whereFilter : $preSQL;
// ...
If you are using multiple filters you can put all inner/left-join strings in an array and then after analysing the request, you can construct your $preSQL using the minimum of joins.
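The "array of join strings" idea could look roughly like the sketch below. It is written in Python rather than PHP, the FILTERS registry and the build_count_query helper are hypothetical names, and the %s placeholder style is an assumption (it matches common Python MySQL drivers). Inner joins are used because a filter on a joined table discards the unmatched rows anyway.

```python
# Hypothetical registry: for each filter, the joins it needs (in dependency
# order) and the column it filters on. Names follow the question's schema.
CLIENTS_LOCATIONS = (
    "JOIN clients_locations ON visits.client_location_id = clients_locations.id"
)
FILTERS = {
    "province_id": (
        [
            CLIENTS_LOCATIONS,
            "JOIN locations ON clients_locations.location_id = locations.id",
            "JOIN districts ON locations.district_id = districts.id",
            "JOIN provinces ON districts.province_id = provinces.id",
        ],
        "provinces.id",
    ),
    "specialty_id": (
        [
            CLIENTS_LOCATIONS,
            "JOIN clients ON clients_locations.client_id = clients.id",
        ],
        "clients.specialty_id",
    ),
}


def build_count_query(requested):
    """Return (sql, params) using only the joins the active filters need."""
    joins, conditions, params = [], [], []
    for name, values in requested.items():
        if not values:
            continue  # filter not set: contributes no join and no condition
        needed_joins, column = FILTERS[name]
        for join in needed_joins:
            if join not in joins:  # shared joins are added only once
                joins.append(join)
        placeholders = ", ".join(["%s"] * len(values))
        conditions.append(f"{column} IN ({placeholders})")
        params.extend(values)
    sql = "SELECT COUNT(DISTINCT visits.id) AS Count_id FROM visits"
    if joins:
        sql += " " + " ".join(joins)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = build_count_query({"province_id": [3, 7], "specialty_id": []})
print(sql)
print(params)
```

With no filters set, the function returns the bare count on visits with no joins at all, which is the fast startup case from the question. Passing the values as parameters instead of interpolating them into the string also avoids SQL injection.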

Use COUNT(CASE WHEN visit_id!="" THEN 1 END) as visit.
Hope this will help

Isn't it just:
SELECT COUNT(id)
FROM visits
because all the left outer joins also return a visits.id when there's no matching clients, ..., calls, and ids ought to be unique?
A different hint: the one inner join is only effective when a client exists. In general, inner joins should be placed as close as possible to the source table, so in your example it would have been best in the line after "left join clients".

I didn't fully understand your idea, especially the INNER JOIN that will transform some of the LEFT JOINs into INNER JOINs; it seems strange, but let's try a solution:
LEFT JOINs often perform badly, and I think you need these tables only if you use them in the WHERE clause, so you can include them with INNER JOIN only when you actually filter on them.
For example:
$query = "Select Count(Distinct visits.id) As Count_id From visits ";
if($temp != null){
$query .= " INNER JOIN clients_locations ON visits.client_location_id = clients_locations.id ";
$query .= " INNER JOIN locations ON clients_locations.location_id = locations.id ";
$query .= " INNER JOIN districts ON locations.district_id = districts.id ";
$query .= " INNER JOIN provinces ON districts.province_id = provinces.id ";
$whereFilter .= " and provinces.id In ($temp) ";
}
I think it'll help your performance and it'll work as you need.

Related

Informatica: How to make a condition for two tables joining execution

Actually, there are 4 tables in total involved in this mapping: Market, Cost, A, B.
Read_sourceTB_B-----FIL1------->---------JNR4 \
| | |
| Read_sourceTB_Market--\ | |
| Read_sourceTB_Cost------JNR1--\ | |
| Read_sourceTB_A-----------------JNR2 JNR5--->EXP... -->TGT
| | | |
| | | |
| | | |
---------------------FIL2->---------JNR3 /
SQ_TABLEB --FIL1-> -- JNR1 \
| | |
| SQ_TABLEA --| JNR3-->EXP.... -->TGT
| | |
|--FIL2-> -- JNR2 /
First joining condition
A LEFT JOIN B
ON A.MEMBERSHIPID = B.MEMBERSHIPID
Where B.System_Code='University'
IF <First joining condition> failed, then execute
Second joining condition
A LEFT JOIN B ON
A.address = B.address and A.phonenumber = B.phonenumber
Where B.System_Code='Policy'
Which transformation should I use? I don't know how to use Informatica; my version is Informatica Developer 10.5. Please help me. Thanks!
I only know how to
A left join B on `condition` `System_Code='University'`
left join B on `condition` `System_Code='Policy'`
but I don't know how to make a decision for
if A join B System_Code='University' failed,
then A join B System_Code='Policy'
You need to join A with B (twice) based on two different condition and then join them back to one single pipeline for a decision/if-else condition.
Also, please note that all your left joins are actually inner joins, because you are using a B.xxx='something' condition in the where clause.
So, considering the above problem -
After the source qualifier of B, add two filters, FIL1 (System_Code='University') and FIL2 (System_Code='Policy'), in parallel.
Then use JNR1 to join A and B(FIL1) using a JOINER on A.MEMBERSHIPID = B_F1.MEMBERSHIPID. Use A as the detail table and use 'inner join'.
Then join A and B(FIL2) using a JOINER (JNR2) on A.address = B_F2.address and A.phonenumber = B_F2.phonenumber. Use A as the detail table and use 'inner join'.
Then join the above two pipelines into one single pipeline using another Joiner (JNR3). It should be a normal join, and the join condition should be the primary key of table A. Get all required columns.
(EXP) Then use an expression transformation. Use logic similar to the one below.
out_col1 = IIF( isnull(col_tableB_F1_jnr1),col_tableB_F2_jnr2, col_tableB_F1_jnr1)
Whole mapping should look like this -
SQ_TABLEB --FIL1-> -- JNR1 \
| | |
| SQ_TABLEA --| JNR3-->EXP.... -->TGT
| | |
|--FIL2-> -- JNR2 /
But I think your requirement may be like this -
A LEFT JOIN B
ON A.MEMBERSHIPID = B.MEMBERSHIPID AND B.System_Code='University'
If yes, then change the inner join to master outer join in JNR1 and JNR2.
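This is not Informatica code, but the fallback logic of the mapping (try the University match first, otherwise take the Policy match) can be sketched in plain SQL, here run through Python's sqlite3 module. The info column, the a.id key, and all rows are made up for illustration. COALESCE plays the role of the IIF(isnull(...)) expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, membershipid TEXT,
                address TEXT, phonenumber TEXT);
CREATE TABLE b (membershipid TEXT, address TEXT, phonenumber TEXT,
                system_code TEXT, info TEXT);

INSERT INTO a VALUES
    (1, 'M1', '1 Main St', '555-0001'),
    (2, 'M2', '2 Oak Ave', '555-0002'),
    (3, 'M3', '3 Elm Rd',  '555-0003');
INSERT INTO b VALUES
    ('M1', NULL, NULL, 'University', 'uni-row'),
    (NULL, '2 Oak Ave', '555-0002', 'Policy', 'policy-row'),
    ('M1', '1 Main St', '555-0001', 'Policy', 'unused-policy-row');
""")

rows = conn.execute("""
SELECT a.id,
       COALESCE(b1.info, b2.info) AS info   -- first condition wins if it matched
FROM a
LEFT JOIN b AS b1                           -- corresponds to FIL1 + JNR1
       ON a.membershipid = b1.membershipid
      AND b1.system_code = 'University'
LEFT JOIN b AS b2                           -- corresponds to FIL2 + JNR2
       ON a.address = b2.address
      AND a.phonenumber = b2.phonenumber
      AND b2.system_code = 'Policy'
ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Row 1 matches both conditions and keeps the University value, row 2 falls back to the Policy match, and row 3 matches neither and stays NULL. Because the System_Code tests sit in the ON clauses and not in a WHERE clause, the joins remain true outer joins.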

how to join many tables

This is my database:
contact_photo and contact_video are 2 "join" tables. They have a foreign key to my_contact table and a foreign key to photo/video tables.
Is there a way, with one query, to retrieve all the records in those 5 tables, starting from the value of my_contact.id?
I know how to join, for example, my_contact, contact_photo, and photo,
but I have no idea how to join those 5 tables together.
I tried:
query = "SELECT * " +
"FROM my_contact m " +
"INNER JOIN contact_photo cp ON ( m._id = cp.id_contact ) " +
"INNER JOIN photo p ON ( cp.id_photo = p._id) " +
"INNER JOIN contact_video cv ON ( m._id = cv.id_contact ) " +
"INNER JOIN video v ON ( cv.id_video = v._id) ";
But I get no record, even if my_contact, contact_photo and photo have records (contact_video and video are empty tables). Is there something wrong in my logic?
The result I'd like to get are records like these:
--------------------------------------------------------------------------
| my_contact._id | photo.id_photo_on_device | photo.uri |
--------------------------------------------------------------------------
| 1 | 23 | C:\PROGRAM|.... |
--------------------------------------------------------------------------
--------------------------------------------------------------------------
| my_contact._id | video.id_video_on_device | video.uri |
--------------------------------------------------------------------------
| 1 | 36 | C:\PROGRAM|.... |
--------------------------------------------------------------------------
Actually, looking at the records I'd like to get it seems there's something wrong in my logic.
I recommend a union all query. The structure will resemble this:
select c._id
, p.id_photo_on_device
, p.uri
from my_contact c join contact_photo cp on c._id = cp.id_contact
join photo p on p._id = cp.id_photo
union all
select c._id
, v.id_video_on_device
, v.uri
from my_contact c join contact_video cv on c._id = cv.id_contact
join video v on v._id = cv.id_video
Note that not all fields will be useful. For example, even if you wanted the _id field from my_contact, you probably only need it once. That's why selecting fields from the junction tables is probably unnecessary.
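To see why the chained inner joins from the question return nothing while a union all does, here is a small sketch with Python's sqlite3 module. The table and column names come from the question, the kind column is a hypothetical addition that tells the two halves apart, and the data is made up (one photo, no videos).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_contact (_id INTEGER PRIMARY KEY);
CREATE TABLE photo (_id INTEGER PRIMARY KEY, id_photo_on_device INTEGER, uri TEXT);
CREATE TABLE video (_id INTEGER PRIMARY KEY, id_video_on_device INTEGER, uri TEXT);
CREATE TABLE contact_photo (id_contact INTEGER, id_photo INTEGER);
CREATE TABLE contact_video (id_contact INTEGER, id_video INTEGER);

-- One contact with one photo; contact_video and video stay empty.
INSERT INTO my_contact VALUES (1);
INSERT INTO photo VALUES (5, 23, 'photo-uri');
INSERT INTO contact_photo VALUES (1, 5);
""")

# The query from the question: every inner join must match, so the empty
# video tables eliminate all rows.
chained = conn.execute("""
SELECT *
FROM my_contact m
INNER JOIN contact_photo cp ON m._id = cp.id_contact
INNER JOIN photo p ON cp.id_photo = p._id
INNER JOIN contact_video cv ON m._id = cv.id_contact
INNER JOIN video v ON cv.id_video = v._id
""").fetchall()

# UNION ALL: photos and videos are fetched independently and stacked.
combined = conn.execute("""
SELECT c._id, 'photo' AS kind, p.id_photo_on_device AS id_on_device, p.uri
FROM my_contact c
JOIN contact_photo cp ON c._id = cp.id_contact
JOIN photo p ON p._id = cp.id_photo
WHERE c._id = ?
UNION ALL
SELECT c._id, 'video', v.id_video_on_device, v.uri
FROM my_contact c
JOIN contact_video cv ON c._id = cv.id_contact
JOIN video v ON v._id = cv.id_video
WHERE c._id = ?
""", (1, 1)).fetchall()

print(chained)
print(combined)
```

Even when both sides have data, the chained version would pair every photo with every video of the contact, which is rarely what you want; the union keeps one row per photo and one row per video.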

two inner join in one sql statement

What's wrong here? I just want to display all the items in item_tb with 2 different groups, vicma and branch, but it returns nothing. It works with only one inner join, but when I join the other one it displays nothing.
|-------------|-------------------------|---------------|
|item_tb | vicma_tb | branch_tb |
| | vID - PK | id-PK |
|branchID-FK | | |
|vicma - FK | | |
|-------------|-------------------------|---------------|
$sql = "
SELECT item_tb.*
, branch_tb.*
, vicma_tb.*
from item_tb
JOIN branch_tb
on item_tb.branchID = branch_tb.id
JOIN vicma_tb
on item_tb.vicma = vicma_tb.vID ";
Seems like you need to do a LEFT JOIN instead of INNER JOIN. LEFT JOIN will return all values from your original table and NULL if there is no match. Try:
SELECT item_tb.*, branch_tb.* , vicma_tb.* from item_tb
LEFT JOIN branch_tb on item_tb.branchID = branch_tb.id
LEFT JOIN vicma_tb on item_tb.vicma = vicma_tb.vID
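A quick way to see the difference is the sketch below, using Python's sqlite3 module. The itemID column and the rows are made up, and vicma_tb is deliberately left empty to mimic items whose vicma value has no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch_tb (id INTEGER PRIMARY KEY);
CREATE TABLE vicma_tb (vID INTEGER PRIMARY KEY);
CREATE TABLE item_tb (itemID INTEGER PRIMARY KEY, branchID INTEGER, vicma INTEGER);

INSERT INTO branch_tb VALUES (1);
-- vicma_tb is left empty: no item can find a matching vicma row.
INSERT INTO item_tb VALUES (1, 1, 9), (2, 1, NULL);
""")

columns = "item_tb.itemID, branch_tb.id, vicma_tb.vID"

# Inner joins: an item survives only if BOTH lookups find a row.
inner_rows = conn.execute(f"""
SELECT {columns}
FROM item_tb
JOIN branch_tb ON item_tb.branchID = branch_tb.id
JOIN vicma_tb ON item_tb.vicma = vicma_tb.vID
ORDER BY item_tb.itemID
""").fetchall()

# Left joins: every item is kept; missing matches show up as NULL (None).
left_rows = conn.execute(f"""
SELECT {columns}
FROM item_tb
LEFT JOIN branch_tb ON item_tb.branchID = branch_tb.id
LEFT JOIN vicma_tb ON item_tb.vicma = vicma_tb.vID
ORDER BY item_tb.itemID
""").fetchall()

print(inner_rows)
print(left_rows)
```

If the left join version shows NULLs in the vicma columns for every item, the values stored in item_tb.vicma probably do not match any vicma_tb.vID, which is worth checking in the data.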

SQL Query giving an error of Unknown Column

I am creating an application in PHP of Power Meter Analysis. I have following table structure:
table: 'feeds'
| feed_id | device_no | current1 | voltage1 | power_factor_1 | vc1 | ic1 | date_added
-------------------------------------------------------------------------------------
| 36752 | 2 | 36.048 | 196.01 | 0.9 | 1 | 1 | 2014-06-23 14:14:44
| 36753 | 2 | 35.963 | 195.59 | 0.9 | 1 | 1 | 2014-06-23 14:15:34
and so on.
table: 'machine'
| machine_id | machine_phone | machine_name | company_id |
----------------------------------------------------------
| 1 | 2 | ABC Machine | 1 |
| 2 | 093 | DEF Machine | 1 |
I need records on hourly basis and I have written the following query for this purpose:
$sql = "
SELECT
SUM(t.power1) AS 'power1'
, HOUR(t.date) AS 'pulse_hour'
FROM (
SELECT
IF(@diff = 0, 0, (((f.voltage1*f.vc1)*(f.current1*f.ic1)*(f.power_factor_1))/1000) * (@diff/3600)) AS 'power1'
, IF(@diff = 0,0, TIME_TO_SEC(f.date_added) - @diff) AS 'deltaT'
, @diff := TIME_TO_SEC(f.date_added)
, f.date_added AS 'date'
FROM
feeds f,
(SELECT @diff := 0) AS X
left join
machine m
on
f.device_no = m.machine_phone
left join
company c
on
c.company_id = m.company_id
";
$sql .= $params['machine_id'] ? " where f.device_no = '".$params['machine_id']."'" : " where f.device_no > 0";
$sql .= $params['machine_pulse_datetime_from'] ? " and f.date_added >= '".$params['machine_pulse_datetime_from']."'" : "";
$sql .= $params['machine_pulse_datetime_to'] ? " and f.date_added <= '".$params['machine_pulse_datetime_to']."'" : "";
$sql .= $params['company_id'] ? " and c.company_id = '".$params['company_id']."'" : "";
$sql .= "
ORDER BY
f.date_added ASC
) t
GROUP BY HOUR(t.date)
ORDER BY HOUR(t.date) ASC
";
The query is running OK if I remove the following part from the query:
left join
machine m
on
f.device_no = m.machine_phone
left join
company c
on
c.company_id = m.company_id
But with this part It is giving me following error:
Error Code : 1054
Unknown column 'f.device_no' in 'on clause'
can you please help me to sort this out... I have spent an hour with this query :(
This is your join:
FROM feeds f,
(SELECT @diff := 0) AS X left join
machine m
on f.device_no = m.machine_phone left join
company c
on c.company_id = m.company_id
The problem is that you are mixing explicit and implicit joins. You can fix this by replacing the comma with cross join:
FROM feeds f cross join
(SELECT @diff := 0) AS X left join
machine m
on f.device_no = m.machine_phone left join
company c
on c.company_id = m.company_id
The issue, which is buried deep in the documentation for select, is that a comma (,) is a lot like a cross join, with the exception of scoping rules, that is, when the table aliases are recognized. With a comma, the table aliases are not recognized as you expect. They are with a cross join.
Here is the reference:
INNER [CROSS] JOIN and , (comma) are semantically equivalent in the
absence of a join condition: both produce a Cartesian product between
the specified tables (that is, each and every row in the first table
is joined to each and every row in the second table).
However, the precedence of the comma operator is less than that of INNER
JOIN, CROSS JOIN, LEFT JOIN, and so on. If you mix comma joins
with the other join types when there is a join condition, an error of
the form Unknown column 'col_name' in 'on clause' may occur.
Information about dealing with this problem is given later in this
section.

How would I execute this complex conditional multi-table MySQL join (queries provided)?

Ok, I have a few tables. I am only showing the relevant fields:
items:
----------------------------------------------------------------
name | owner_id | location_id | cab_id | description |
----------------------------------------------------------------
itm_A | 11 | 23 | 100 | Blah |
----------------------------------------------------------------
.
.
.
users:
-------------------------
id | name |
-------------------------
11 | John |
-------------------------
.
.
.
locations
-------------------------
id | name |
-------------------------
23 | Seattle |
-------------------------
.
.
.
cabs
id | location_id | name
-----------------------------------
100 | 23 | Cool |
-----------------------------------
101 | 24 | Cool |
-----------------------------------
102 | 24 |thecab |
-----------------------------------
I am trying to SELECT all items (and their owner info) that are from Seattle OR Denver, but if they are in Seattle they can only be in the cab NAMED Cool and if they are in Denver they can only be in the cab named 'thecab' (not Denver AND cool).
This query doesn't work but I hope it explains what I am trying to accomplish:
SELECT DISTINCT
`item`.`name`,
`item`.`owner_id`,
`item`.`description`,
`user`.`name`,
IF(`loc`.`name` = 'Seattle' AND `cab`.`name` = 'Cool',1,0) AS `cab_test_1`,
IF(`loc`.`name` = 'Denver' AND `cab`.`name` = 'thecab',1,0) AS `cab_test_2`,
FROM `items` AS `item`
LEFT JOIN `users` AS `user` ON `item`.`owner_id` = `user`.`id`
LEFT JOIN `locations` AS `loc` ON `item`.`location_id` = `loc`.`location_id`
LEFT JOIN `cabs` AS `cab` ON `item`.`cab_id` = `cabs`.`id`
WHERE (`loc`.`name` IN ("Seattle","Denver")) AND `cab_test_1` = 1 AND `cab_test_2` = 1
I'd rather get rid of the IFs if possible. It seems inefficient, looks clunky, and is not scalable if I have a lot of location\name pairs.
Try this:
SELECT DISTINCT
item.name,
item.owner_id,
item.description,
user.name
FROM items AS item
LEFT JOIN users AS user ON item.owner_id = user.id
LEFT JOIN locations AS loc ON item.location_id = loc.id
LEFT JOIN cabs AS cab ON item.cab_id = cab.id
WHERE ((loc.name = 'Seattle' AND cab.name = 'Cool')
OR (loc.name = 'Denver' AND cab.name = 'thecab'))
My first thought is to store the pairs of locations and cab names in a separate table. Well not quite a table, but a derived table generated by a subquery.
You still have the problem of pivoting the test results into separate columns. The code can be simplified by making use of mysql boolean expressions, which get rid of the need for a case or if.
So, the approach is to use the same joins you have (although left join is not needed, because the comparison on cab.name turns them into inner joins). Then add a table of the pairs you are looking for, along with the "test name" for each pair. The final step is an explicit group by and a check whether the conditions are met for each test:
SELECT i.`name`, i.`owner_id`, i.`description`, u.`name`,
max(pairs.testname = 'test_1') as cab_test_1,
max(pairs.testname = 'test_2') as cab_test_2
FROM `items` i LEFT JOIN
`users` u
ON i.`owner_id` = u.`id` LEFT JOIN
`locations` l
ON i.`location_id` = l.`id` left join
`cabs` c
ON i.`cab_id` = c.`id` join
(select 'test_1' as testname, 'Seattle' as loc, 'Cool' as cabname union all
select 'test_2', 'Denver', 'thecab'
) pairs
on l.name = pairs.loc and
c.name = pairs.cabname
group by i.`name`, i.`owner_id`, i.`description`, u.`name`;
To add additional pairs, add them to the pairs table, and add an appropriate line in the select for the test flag.
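The pairs approach can be tried out with the sketch below, which uses Python's sqlite3 module instead of MySQL. The tables follow the question; the extra items itm_B and itm_C, and the assumption that location 24 is Denver, are made up for illustration. Note that SQLite compares strings case-sensitively, so the pair must say 'Cool' exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cabs (id INTEGER PRIMARY KEY, location_id INTEGER, name TEXT);
CREATE TABLE items (name TEXT, owner_id INTEGER, location_id INTEGER,
                    cab_id INTEGER, description TEXT);

INSERT INTO users VALUES (11, 'John');
INSERT INTO locations VALUES (23, 'Seattle'), (24, 'Denver');
INSERT INTO cabs VALUES (100, 23, 'Cool'), (101, 24, 'Cool'), (102, 24, 'thecab');
INSERT INTO items VALUES
    ('itm_A', 11, 23, 100, 'Blah'),   -- Seattle + Cool   -> kept
    ('itm_B', 11, 24, 101, 'Blah'),   -- Denver + Cool    -> dropped
    ('itm_C', 11, 24, 102, 'Blah');   -- Denver + thecab  -> kept
""")

rows = conn.execute("""
SELECT i.name, u.name,
       MAX(pairs.testname = 'test_1') AS cab_test_1,
       MAX(pairs.testname = 'test_2') AS cab_test_2
FROM items i
LEFT JOIN users u ON i.owner_id = u.id
JOIN locations l ON i.location_id = l.id
JOIN cabs c ON i.cab_id = c.id
JOIN (SELECT 'test_1' AS testname, 'Seattle' AS loc, 'Cool' AS cabname
      UNION ALL
      SELECT 'test_2', 'Denver', 'thecab') pairs
  ON l.name = pairs.loc AND c.name = pairs.cabname
GROUP BY i.name, i.owner_id, i.description, u.name
ORDER BY i.name
""").fetchall()

for row in rows:
    print(row)
```

Only the items whose (location, cab name) combination appears in the pairs list come back, and each carries a 1 in the flag column of the pair it matched. With many pairs, a real table with an index on (loc, cabname) may be easier to maintain than the inline derived table.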