Unique Left Join Gives Skewed Results - mysql

I'm very perplexed! I have 3 tables that separately give me the results I'm looking for. When I join them (tried join, union, left join, sub query) I get skewed results.
Table 1:
SELECT DISTINCT JB.job_id, AR.job_id as jid, SUM(AR.ar_payment_amount) AS sum,
JB.marketing_campaign
FROM job_tbl JB
LEFT JOIN ar_payment_tbl AR ON JB.job_id = AR.job_id
WHERE JB.marketing_campaign != ''
AND FROM_UNIXTIME(AR.ar_payment_date,'%Y') = YEAR(NOW())
GROUP BY JB.marketing_campaign
ORDER BY sum DESC
LIMIT 10
Which gives me the result I'm looking for (just showing one result for this example)
[job_id] => 551
[jid] => 551
[sum] => 124440.97024536133
[marketing_campaign] => Roto Rooter
Table 2:
SELECT DISTINCT JB.job_id, AP.job_id, SUM(price) AS price, AP.vendor
FROM job_tbl JB
LEFT JOIN ap_tbl AP ON JB.job_id = AP.job_id
WHERE AP.marked_as_paid = 1
AND AP.activity = 'Commission'
AND FROM_UNIXTIME(AP.payment_date,'%Y') = YEAR(NOW())
GROUP BY vendor
ORDER BY price DESC
LIMIT 10
Which gives me the results I'm looking for...
[job_id] => 551
[price] => 5700
[vendor] => 436
Now when I join them I get a different result
SELECT DISTINCT JB.job_id, AR.job_id as arid, AP.job_id as apid,
SUM(AR.ar_payment_amount) AS sum, SUM(AP.price) AS price, JB.marketing_campaign
FROM job_tbl JB
LEFT JOIN ap_tbl AP ON JB.job_id = AP.job_id
AND AP.marked_as_paid = 1
AND AP.activity = 'Commission'
LEFT JOIN ar_payment_tbl AR ON JB.job_id = AR.job_id
AND FROM_UNIXTIME(AR.ar_payment_date,'%Y') = YEAR(NOW())
WHERE AP.price != ''
AND AR.ar_payment_amount != ''
AND JB.marketing_campaign != ''
GROUP BY JB.marketing_campaign
ORDER BY sum DESC
LIMIT 10
Here is the result I get
[job_id] => 551
[arid] => 551
[apid] => 551
[sum] => 130507.02011108398
[price] => 8200
[marketing_campaign] => Roto Rooter
and here is what the result should be
[job_id] => 551
[arid] => 551
[apid] => 551
[sum] => 124440.97024536133
[price] => 5700
[marketing_campaign] => Roto Rooter
Any help would be appreciated and this project was due last Friday! ;-)

Try this. I think there was a problem with the JOIN condition and the GROUP BY:
SELECT
JB.job_id,
AR.job_id as arid,
AP.job_id as apid,
JB.marketing_campaign,
SUM(AR.ar_payment_amount) AS `sum`,
SUM(AP.price) AS price
FROM job_tbl JB
LEFT JOIN ap_tbl AP ON JB.job_id = AP.job_id
LEFT JOIN ar_payment_tbl AR ON JB.job_id = AR.job_id
WHERE
AP.marked_as_paid = 1
AND AP.activity = 'Commission'
AND FROM_UNIXTIME(AR.ar_payment_date,'%Y') = YEAR(NOW())
AND AP.price != ''
AND AR.ar_payment_amount != ''
AND JB.marketing_campaign != ''
GROUP BY 1,2,3,4
ORDER BY `sum` DESC LIMIT 10
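One likely reason the joined totals come out inflated is row multiplication: when a job has several rows in both ap_tbl and ar_payment_tbl, joining both tables to job_tbl pairs every AP row with every AR row, so each amount is summed more than once. Below is a minimal sketch of that effect and of one way around it (aggregating each table per job_id before joining). It uses Python's sqlite3 module rather than MySQL; the table and column names follow the question, while the rows, the amounts and the omitted date/status filters are assumptions made for illustration.

```python
import sqlite3

# Hypothetical miniature of the three tables; all amounts are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job_tbl (job_id INTEGER, marketing_campaign TEXT);
    CREATE TABLE ar_payment_tbl (job_id INTEGER, ar_payment_amount INTEGER);
    CREATE TABLE ap_tbl (job_id INTEGER, price INTEGER);
    INSERT INTO job_tbl VALUES (551, 'Roto Rooter');
    INSERT INTO ar_payment_tbl VALUES (551, 100), (551, 50);
    INSERT INTO ap_tbl VALUES (551, 10), (551, 20);
""")

# Joining both child tables directly yields 2 x 2 = 4 rows for job 551,
# so every payment and every price is counted twice.
naive = conn.execute("""
    SELECT JB.marketing_campaign, SUM(AR.ar_payment_amount), SUM(AP.price)
    FROM job_tbl JB
    LEFT JOIN ap_tbl AP ON JB.job_id = AP.job_id
    LEFT JOIN ar_payment_tbl AR ON JB.job_id = AR.job_id
    GROUP BY JB.marketing_campaign
""").fetchall()

# Aggregating each child table per job first leaves at most one row per job
# on each side of the join, so nothing is multiplied.
pre_aggregated = conn.execute("""
    SELECT JB.marketing_campaign, SUM(AR.total), SUM(AP.total)
    FROM job_tbl JB
    LEFT JOIN (SELECT job_id, SUM(price) AS total
               FROM ap_tbl GROUP BY job_id) AP ON JB.job_id = AP.job_id
    LEFT JOIN (SELECT job_id, SUM(ar_payment_amount) AS total
               FROM ar_payment_tbl GROUP BY job_id) AR ON JB.job_id = AR.job_id
    GROUP BY JB.marketing_campaign
""").fetchall()

print(naive)           # inflated totals
print(pre_aggregated)  # totals that match the separate queries
```

In the real query, the per-table filters (marked_as_paid, activity, the payment year) would presumably go inside the corresponding derived table.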

Related

How to improve performance with an ORDER BY clause

I have a query that is reading through approximately 2.4m rows of data.
The query itself is running well but the ORDER BY clause is causing performance issues. If I remove the ORDER BY the query takes 0.03 seconds to execute. With the ORDER BY it can take 4.5 to 5 seconds.
Is there any way I can optimise this query further? Indexes have already been added, so that isn't a solution.
EDIT 1 -
This query is a shortened version of a much bigger PDO query so I think the join is necessary. You can see the main query at the bottom of this post.
SELECT t.processing_time, t.paymentType, t.status, t.merchantTransactionId, t.paymentBrand, t.amount, t.currency, t.code, t.holder, t.bin, t.last4Digits, t.recurringType, m.name AS merchant, c.name AS channel, concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)), lower(SUBSTRING(trim(sp.status_description),2))) as status_description
FROM transactionsV2 t
JOIN channels c
ON t.entityId = c.uuid
JOIN merchants m
ON m.uuid = c.sender
JOIN status_payments sp
ON t.code = sp.status_code
JOIN (
SELECT t.id, t.processing_time FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
WHERE (t.processing_time >= "2018-11-08 00:00:00")
AND (t.processing_time <= "2018-11-12 23:59:59")
ORDER BY t.processing_time DESC
LIMIT 1000
) t2
ON t.id = t2.id
WHERE t.status = 1
$transactions = DB::connection('mysql2')->select(DB::raw("SELECT t.processing_time, t.paymentType, t.status, t.merchantTransactionId, t.paymentBrand, t.amount, t.currency, t.code, t.holder, t.bin, t.last4Digits, t.recurringType, m.name AS merchant, c.name AS channel, concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)), lower(SUBSTRING(trim(sp.status_description),2))) as status_description
FROM transactionsV2 t
JOIN channels c
ON t.entityId = c.uuid
JOIN merchants m
ON m.uuid = c.sender
JOIN status_payments sp
ON t.code = sp.status_code
JOIN (
SELECT t.id, t.processing_time FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
WHERE (t.processing_time >= :insTs1)
AND (t.processing_time <= :insTs2)
AND (:merchant1 IS NULL OR m.name LIKE :merchant2)
AND (:channel1 IS NULL OR c.name LIKE :channel2)
ORDER BY t.processing_time DESC
LIMIT 1000
) t2
ON t.id = t2.id
WHERE (:status1 IS NULL OR t.status = :status2)
AND (:holder1 IS NULL OR holder LIKE :holder2)
AND (:paymentType1 IS NULL OR t.paymentType IN (".$paymentType."))
AND (:merchantTransactionId1 IS NULL OR merchantTransactionId LIKE :merchantTransactionId2)
AND (:paymentBrand1 IS NULL OR paymentBrand LIKE :paymentBrand2)
AND (:amount1 IS NULL OR amount = :amount2)
AND (:recurringType1 IS NULL OR t.recurringType = :recurringType2)"),
['status1' => $search->searchCriteria['status'],
'status2' => $search->searchCriteria['status'],
'holder1' => $search->searchCriteria['holder'],
'holder2' => '%'.$search->searchCriteria['holder'].'%',
'paymentType1' => $paymentType,
'merchantTransactionId1' => $search->searchCriteria['merchantTransactionId'],
'merchantTransactionId2' => '%'.$search->searchCriteria['merchantTransactionId'].'%',
'paymentBrand1' => $search->searchCriteria['paymentBrand'],
'paymentBrand2' => '%'.$search->searchCriteria['paymentBrand'].'%',
'amount1' => $search->searchCriteria['amount'],
'amount2' => $search->searchCriteria['amount'],
'recurringType1' => $search->searchCriteria['recurringType'],
'recurringType2' => $search->searchCriteria['recurringType'],
'merchant1' => $search->searchCriteria['merchant'],
'merchant2' => '%'.$search->searchCriteria['merchant'].'%',
'channel1' => $search->searchCriteria['channel'],
'channel2' => '%'.$search->searchCriteria['channel'].'%',
'insTs1' => $search->searchCriteria['fromDate'] . ' 00:00:00',
'insTs2' => $search->searchCriteria['toDate'] . ' 23:59:59']);
Perhaps I'm missing something, but I don't see that the subquery requires the joins. Does this suffice?
SELECT t.id, t.processing_time
FROM transactionsV2 t
WHERE t.processing_time >= '2018-11-08' AND
t.processing_time < '2018-11-13'
ORDER BY t.processing_time DESC
LIMIT 1000
If so, an index on transactionsV2(processing_time) would help (assuming that it is not a view).
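To illustrate why such an index matters for ORDER BY ... LIMIT, here is a small sketch using Python's sqlite3 module. It is only an analogy: SQLite's planner and its EXPLAIN QUERY PLAN output differ from MySQL's EXPLAIN, and the two-column table and the index name idx_processing_time are assumptions. It shows the general idea that, without an index on the sort column, the engine needs a separate sort step, whereas with one it can read the rows in index order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the table: only the columns the subquery uses.
conn.execute(
    "CREATE TABLE transactionsV2 (id INTEGER PRIMARY KEY, processing_time TEXT)"
)

query = """
    SELECT id, processing_time
    FROM transactionsV2
    WHERE processing_time >= '2018-11-08' AND processing_time < '2018-11-13'
    ORDER BY processing_time DESC
    LIMIT 1000
"""

def plan(connection, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]

# Without an index the plan contains an explicit sort step for the ORDER BY.
plan_without_index = plan(conn, query)

conn.execute(
    "CREATE INDEX idx_processing_time ON transactionsV2 (processing_time)"
)

# With the index the range filter and the ordering are both served by the index.
plan_with_index = plan(conn, query)

print(plan_without_index)
print(plan_with_index)
```

On MySQL the equivalent check would be to run EXPLAIN on the subquery and see whether "Using filesort" appears.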
I believe the subquery is redundant, since it is an independent (uncorrelated) subquery and you join to it on the primary key (transactionsV2.id). You can simply use:
SELECT t.processing_time,
t.paymentType,
t.status,
t.merchantTransactionId,
t.paymentBrand,
t.amount,
t.currency,
t.code,
t.holder,
t.bin,
t.last4Digits,
t.recurringType,
m.name AS merchant,
c.name AS channel,
concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)),
lower(SUBSTRING(trim(sp.status_description),2))) as status_description,
row_number() over ()
FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
JOIN status_payments sp ON t.code = sp.status_code
WHERE (t.processing_time >= "2018-11-08 00:00:00") AND (t.processing_time <= "2018-11-12 23:59:59") and t.status = 1
ORDER BY t.processing_time DESC
LIMIT 1000

Using mysql LEFT Join DESC order is not working

I have the tables below and need to join them to get each user's plan details. My attempt does not work; here is what I have:
user_info
id fullName
1 Arun
2 Sarvan
user_active_plan
id userId planName
1 1 Free Plan
2 1 Cool Plan
3 2 Free Plan
contact_property
id userId contactProperty
1 1 A
2 1 B
3 2 C
user_active_plan.userId is a foreign key referencing user_info.id.
I want to get the latest plan for each userId, so I order by id descending, but I do not get the expected results:
$sql = "SELECT a.fullName, b.* FROM user_info a LEFT JOIN user_active_plan b ON a.id = b.userId GROUP BY b.userId ORDER BY id DESC";
$result = $this->GetJoinRecord($sql);
print_r($result);
I am getting the following incorrect results:
Array
(
[0] => Array
(
[fullName] => Sarvan
[id] => 3
[userId] => 2
[planName] => Free Plan
)
[1] => Array
(
[fullName] => Arun
[id] => 1
[userId] => 1
[planName] => Free Plan
)
)
I was expecting the following:
Array
(
[0] => Array
(
[fullName] => Sarvan
[id] => 3
[userId] => 2
[planName] => Free Plan
)
[1] => Array
(
[fullName] => Arun
[id] => 2
[userId] => 1
[planName] => Cool Plan
)
)
Updated Expected Answer
Array
(
[0] => Array
(
[userId] => 1
[fullName] => Arun
[planId] => 2
[planName] => Cool Plan
[contactCount] => 2
)
[1] => Array
(
[userId] => 2
[fullName] => Sarvan
[planId] => 3
[planName] => Free Plan
[contactCount] => 1
)
)
You can get the latest plan with a simple subquery, no need for grouping. The count of contacts can be done with a simple grouping:
SELECT u.id AS userId, u.fullName, p.id AS planId, p.planName, COUNT(c.userId) AS contactCount
FROM user_info u
LEFT JOIN user_active_plan p ON u.id = p.userId
LEFT JOIN contact_property c ON u.id = c.userId
WHERE p.id = (SELECT id
FROM user_active_plan
WHERE userId = u.id
ORDER BY id DESC
LIMIT 1)
GROUP BY u.id, u.fullName, p.id, p.planName;
You can also move the condition from the WHERE clause to the join:
SELECT u.id AS userId, u.fullName, p.id AS planId, p.planName, COUNT(c.userId) AS contactCount
FROM user_info u
LEFT JOIN user_active_plan p ON u.id = p.userId
AND p.id = (SELECT id
FROM user_active_plan
WHERE userId = u.id
ORDER BY id DESC
LIMIT 1)
LEFT JOIN contact_property c ON u.id = c.userId
GROUP BY u.id, u.fullName, p.id, p.planName;
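As a quick check of the join-condition variant, here is a sketch that loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for MySQL here, and grouping by the user and plan columns (rather than by c.userId) is my own choice, made so that a user without contacts would still get a separate row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_info (id INTEGER PRIMARY KEY, fullName TEXT);
    CREATE TABLE user_active_plan (id INTEGER PRIMARY KEY, userId INTEGER, planName TEXT);
    CREATE TABLE contact_property (id INTEGER PRIMARY KEY, userId INTEGER, contactProperty TEXT);
    INSERT INTO user_info VALUES (1, 'Arun'), (2, 'Sarvan');
    INSERT INTO user_active_plan VALUES (1, 1, 'Free Plan'), (2, 1, 'Cool Plan'), (3, 2, 'Free Plan');
    INSERT INTO contact_property VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

# The correlated subquery picks the highest plan id per user, so the plan join
# contributes at most one row and the contact count is not multiplied.
rows = conn.execute("""
    SELECT u.id, u.fullName, p.id, p.planName, COUNT(c.userId)
    FROM user_info u
    LEFT JOIN user_active_plan p
           ON u.id = p.userId
          AND p.id = (SELECT id FROM user_active_plan
                      WHERE userId = u.id
                      ORDER BY id DESC LIMIT 1)
    LEFT JOIN contact_property c ON u.id = c.userId
    GROUP BY u.id, u.fullName, p.id, p.planName
    ORDER BY u.id
""").fetchall()

for row in rows:
    print(row)
```

With the sample data this returns Arun with plan 2 (Cool Plan) and two contacts, and Sarvan with plan 3 (Free Plan) and one contact, which matches the updated expected answer.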

Unknown column where clause

I have the following DQL query:
SELECT
ps.id,
MAX(ps.dueDate) as due_date,
u.firstName as first_name,
u.lastName as last_name,
u.email,
IDENTITY(ps.loanApplication) as loan_application_id,
DATE_DIFF(MAX(ps.dueDate), CURRENT_DATE()) as diff
FROM
Loan\Entity\PaymentSchedule ps
LEFT JOIN
ps.paymentType pt
LEFT JOIN
ps.loanApplication la
LEFT JOIN
la.status s
LEFT JOIN
la.user u
WHERE
pt.slug != :paymentSlug AND s.keyIdentifier = :status AND diff = 14
GROUP BY
ps.loanApplication
which translates to the following SQL query:
SELECT
p0_.id AS id_0,
MAX(p0_.due_date) AS sclr_1,
u1_.first_name AS first_name_2,
u1_.last_name AS last_name_3,
u1_.email AS email_4,
p0_.loan_application_id AS sclr_5,
DATEDIFF(MAX(p0_.due_date), CURRENT_DATE) AS sclr_6
FROM
payment_schedule p0_
LEFT JOIN
payment_type p2_ ON p0_.payment_type_id = p2_.id
LEFT JOIN
loan_application l3_ ON p0_.loan_application_id = l3_.id
LEFT JOIN
loan_application_status l4_ ON l3_.loan_application_status_id = l4_.id
LEFT JOIN
user u1_ ON l3_.user_id = u1_.id
WHERE
p2_.slug <> ? AND l4_.key_identifier = ? AND sclr_6 = 14
GROUP BY
p0_.loan_application_id
This gives me the following error:
======================================================================
PDOException
SQLSTATE[42S22]: Column not found: 1054 Unknown column 'sclr_6' in 'where clause'
----------------------------------------------------------------------
When I replace the WHERE condition
WHERE pt.slug != :paymentSlug AND s.keyIdentifier = :status AND diff = 14
with
WHERE pt.slug != :paymentSlug AND s.keyIdentifier = :status
it works and returns the correct record. I also tried the following conditions:
WHERE pt.slug != :paymentSlug AND s.keyIdentifier = :status AND DATE_DIFF(MAX(ps.dueDate), CURRENT_DATE()) = :days_diff
WHERE pt.slug != :paymentSlug AND s.keyIdentifier = :status HAVING (DATE_DIFF(MAX(ps.dueDate), CURRENT_DATE())) = :days_diff
Neither of these works either. What am I missing?
Thanks.
If you want to use the alias in your WHERE clause you need a sub-select.
select *
from
(SELECT
p0_.id AS id_0,
MAX(p0_.due_date) AS sclr_1,
u1_.first_name AS first_name_2,
u1_.last_name AS last_name_3,
u1_.email AS email_4,
p0_.loan_application_id AS sclr_5,
DATEDIFF(MAX(p0_.due_date), CURRENT_DATE) AS sclr_6
FROM
payment_schedule p0_
LEFT JOIN
payment_type p2_ ON p0_.payment_type_id = p2_.id
LEFT JOIN
loan_application l3_ ON p0_.loan_application_id = l3_.id
LEFT JOIN
loan_application_status l4_ ON l3_.loan_application_status_id = l4_.id
LEFT JOIN
user u1_ ON l3_.user_id = u1_.id
WHERE
p2_.slug <> ? AND l4_.key_identifier = ?
GROUP BY
p0_.loan_application_id
) A
WHERE
sclr_6 = 14
This is how a query is logically processed:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
Since WHERE is evaluated before SELECT, you cannot use a SELECT alias in the WHERE clause. (MySQL additionally allows SELECT aliases in GROUP BY, HAVING and ORDER BY.)
You cannot use an alias (on the final result fields) in the WHERE clause; however, at least with MySQL, you may use a HAVING clause without needing a GROUP BY.
The expression you are using is the result of an aggregation, so filter on it in a HAVING clause instead. The query then looks like this:
SELECT . . .
WHERE p2_.slug <> ? AND l4_.key_identifier = ?
GROUP BY p0_.loan_application_id
HAVING sclr_6 = 14
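Note that DATE_DIFF() is not a MySQL function; the MySQL function is DATEDIFF(). DATE_DIFF() is the DQL name, which Doctrine translates to DATEDIFF(), as the generated SQL in the question shows.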
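The difference between WHERE and HAVING for an aggregated value can be tried out in isolation. The sketch below uses Python's sqlite3 module instead of MySQL or Doctrine, and a hypothetical days_until_due column stands in for DATEDIFF(due_date, CURRENT_DATE) so that the result does not depend on today's date. The table is reduced to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment_schedule (id INTEGER PRIMARY KEY,
                                   loan_application_id INTEGER,
                                   days_until_due INTEGER);
    INSERT INTO payment_schedule VALUES (1, 10, 7), (2, 10, 14), (3, 20, 30);
""")

# An aggregate cannot be evaluated in WHERE, because WHERE filters individual
# rows before any groups exist.
try:
    conn.execute("""
        SELECT loan_application_id, MAX(days_until_due) AS diff
        FROM payment_schedule
        WHERE MAX(days_until_due) = 14
        GROUP BY loan_application_id
    """).fetchall()
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# HAVING runs after grouping, so the aggregate is available there.
having_rows = conn.execute("""
    SELECT loan_application_id, MAX(days_until_due) AS diff
    FROM payment_schedule
    GROUP BY loan_application_id
    HAVING MAX(days_until_due) = 14
""").fetchall()

print(where_error)
print(having_rows)
```

Only loan application 10, whose latest due date is 14 days away, passes the HAVING filter.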

getting row counts from multiple tables

I am trying to get row counts returned for different tables based on user_id value. users is a table of all users with a unique column of user_id. All other tables have a corresponding user_id column to join on it with.
I would think this would be fairly easy, but for some reason I cannot get the counts to return right.
What I want to accomplish is alerts = ? and locations = ? where ? is the total number of rows in that table where user_id = 1,2,3,4,5,6,7, or 8.
$stmt = $db->prepare("
SELECT
count(t_alerts.user_id) as alerts,
count(t_locations.user_id) as locations
FROM users
LEFT JOIN
(SELECT user_id
FROM alert_logs
WHERE alert_logs.event_title LIKE '%blocked%'
) as t_alerts
on t_alerts.user_id = users.user_id
LEFT JOIN
(SELECT user_id
FROM location_logs
) as t_locations
on t_locations.user_id = users.user_id
WHERE users.user_id IN(1,2,3,4,5,6,7,8)
");
$stmt->execute();
//get results
$results = $stmt->fetch(PDO::FETCH_ASSOC);
EDIT :
A bit of a modification to eliminate the need of supplying the IN values... I use this in some other queries to only get results for 'active' users...
$stmt = $db->prepare("
SELECT
(SELECT COUNT(*)
FROM alert_logs al
WHERE event_title LIKE '%blocked%' AND al.user_id = u.user_id
) as alerts,
(SELECT COUNT(*)
FROM location_logs ll
WHERE ll.user_id = u.user_id
) as locations
FROM
( SELECT account_id, computer_id
FROM computers
WHERE account_id = :account_id
ORDER BY computer_id ASC LIMIT 0, :licenses
) as c
INNER JOIN users as u
on u.computer_id = c.computer_id
");
$binding = array(
'account_id' => $_SESSION['user']['account_id'],
'licenses' => $_SESSION['user']['licenses']
);
$stmt->execute($binding);
I am running into the problem mentioned below with this statement... it is returning an array of counts per user rather than all counts combined into one result.
Array
(
[0] => Array
(
[alerts] => 6
[locations] => 4
)
[1] => Array
(
[alerts] => 3
[locations] => 5
)
[2] => Array
(
[alerts] => 1
[locations] => 4
)
[3] => Array
(
[alerts] => 0
[locations] => 0
)
[4] => Array
(
[alerts] => 0
[locations] => 0
)
[5] => Array
(
[alerts] => 0
[locations] => 0
)
[6] => Array
(
[alerts] => 0
[locations] => 0
)
[7] => Array
(
[alerts] => 0
[locations] => 0
)
)
What can I do to 'combine' results?
The problem is that the alerts are multiplying with the locations. So, if there are 10 alerts and 5 locations, the result is 50 rows. That is what gets counted.
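The easy solution is to use count(distinct). Note that this only yields row counts when the counted column is unique per row; on user_id, as below, it counts distinct users: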
SELECT
count(distinct t_alerts.user_id) as alerts,
count(distinct t_locations.user_id) as locations
. . .
The better solution is often to use a subquery to do the counting along each dimension, and then join the results together.
EDIT:
In your case, nested subqueries in the select might be the best approach, because the query filters on users:
SELECT (SELECT COUNT(*)
FROM alert_logs al
WHERE event_title LIKE '%blocked%' AND
al.user_id = u.user_id
) as alerts,
(SELECT COUNT(*)
FROM location_logs ll
WHERE ll.user_id = u.user_id
) as locations
FROM users u
WHERE u.user_id IN (1,2,3,4,5,6,7,8)
EDIT II:
I see, there is no group by at the end of your query. In that case, you might as well do:
SELECT (SELECT COUNT(*)
FROM alert_logs al
WHERE event_title LIKE '%blocked%' AND
al.user_id IN (1,2,3,4,5,6,7,8)
) as alerts,
(SELECT COUNT(*)
FROM location_logs ll
WHERE ll.user_id IN (1,2,3,4,5,6,7,8)
) as locations;
You don't need the users table at all.
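The multiplication effect and the subquery fix can be reproduced on a few rows. This is a sketch using Python's sqlite3 module in place of MySQL; the table and column names follow the question, but the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE alert_logs (user_id INTEGER, event_title TEXT);
    CREATE TABLE location_logs (user_id INTEGER);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO alert_logs VALUES (1, 'site blocked'), (1, 'app blocked'),
                                  (2, 'site blocked'), (2, 'login ok');
    INSERT INTO location_logs VALUES (1), (1), (1);
""")

# User 1 has 2 matching alerts and 3 locations, so the double join produces
# 2 x 3 = 6 rows for that user and both counts are inflated.
joined = conn.execute("""
    SELECT COUNT(a.user_id), COUNT(l.user_id)
    FROM users u
    LEFT JOIN (SELECT user_id FROM alert_logs
               WHERE event_title LIKE '%blocked%') a ON a.user_id = u.user_id
    LEFT JOIN location_logs l ON l.user_id = u.user_id
    WHERE u.user_id IN (1, 2)
""").fetchone()

# Counting each table in its own scalar subquery keeps the counts independent.
separate = conn.execute("""
    SELECT (SELECT COUNT(*) FROM alert_logs
            WHERE event_title LIKE '%blocked%' AND user_id IN (1, 2)),
           (SELECT COUNT(*) FROM location_logs
            WHERE user_id IN (1, 2))
""").fetchone()

print(joined)    # inflated by the row multiplication
print(separate)  # actual number of rows in each table
```

The same idea applies to the per-user variant in the question: wrapping it in an outer SELECT SUM(alerts), SUM(locations) would presumably combine the per-user rows into one result.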

MySQL select two columns multiple name value pair

I need help building a query that filters on multiple name/value pairs.
Part of my tbl_advert_specific_fields_values table looks like this:
id advert_id field_name field_value
1 654 t1_sqft 50
2 655 t1_yearbuilt 1999
3 1521 t2_doorcount 5
4 656 t1_yearbuilt 2001
5 656 t1_sqft 29
6 654 t1_yearbuilt 2004
SELECT p.*, p.id AS id, p.title AS title, usr.id as advert_user_id,
p.street_num, p.street,c.icon AS cat_icon,c.title AS cat_title,c.title AS cat_title,
p.description as description,
countries.title as country_name,
states.title as state_name,
date_FORMAT(p.created, '%Y-%m-%d') as fcreated
FROM tbl AS p
LEFT JOIN tbl_advertmid AS pm ON pm.advert_id = p.id
INNER JOIN tbl_usermid AS am ON am.advert_id = p.id
LEFT JOIN tbl_users AS usr ON usr.id = am.user_id
INNER JOIN tbl_categories AS c ON c.id = pm.cat_id
INNER JOIN tbl_advert_specific_fields_values AS asfv ON asfv.advert_id = p.id
LEFT JOIN tbl_countries AS countries ON countries.id = p.country
LEFT JOIN tbl_states AS states ON states.id = p.locstate
WHERE p.published = 1 AND p.approved = 1 AND c.published = 1
AND (asfv.field_name = 't1_yearbuilt'
AND CONVERT(asfv.field_value,SIGNED) <= 2004 )
AND (asfv.field_name = 't1_sqft'
AND CONVERT(asfv.field_value,SIGNED) <= 50)
AND p.price <= 10174945 AND (p.advert_type_id = 1)
AND (c.id = 43 OR c.parent = 43)
GROUP BY p.id
ORDER BY p.price DESC
OK, the problem is in the asfv part of the query, which is generated dynamically. It filters adverts by their type-specific fields; asfv is the alias for the tbl_advert_specific_fields_values table.
Without this part:
AND (asfv.field_name = 't1_yearbuilt'
AND CONVERT(asfv.field_value,SIGNED) <= 2004 )
AND (asfv.field_name = 't1_sqft'
AND CONVERT(asfv.field_value,SIGNED) <= 50)
the query returns all adverts with the given advert_type_id whose price is less than or equal to 10.174.945,00 €.
What I need is an updated query that returns only the adverts matching, for example, t1_yearbuilt less than 2005 and t1_sqft less than 51 (advert_id => 654, 656).
I also need a query for value ranges, for example t1_sqft >= 30 AND t1_sqft <= 50 (advert_id => 654).
Does anybody know how to update this query?
Thanks.
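One observation on why the generated conditions return nothing: a single asfv row has exactly one field_name, so it can never be 't1_yearbuilt' and 't1_sqft' at the same time. A possible approach, sketched below with Python's sqlite3 module rather than MySQL, is to OR the per-field conditions together and keep only the adverts for which every requested field matched. The helper find_adverts and its ranges argument are hypothetical names, the table is reduced to the sample rows from the question, and CAST(... AS INTEGER) is SQLite's counterpart of CONVERT(..., SIGNED).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_advert_specific_fields_values (
        id INTEGER PRIMARY KEY, advert_id INTEGER,
        field_name TEXT, field_value TEXT);
    INSERT INTO tbl_advert_specific_fields_values VALUES
        (1, 654, 't1_sqft', '50'), (2, 655, 't1_yearbuilt', '1999'),
        (3, 1521, 't2_doorcount', '5'), (4, 656, 't1_yearbuilt', '2001'),
        (5, 656, 't1_sqft', '29'), (6, 654, 't1_yearbuilt', '2004');
""")

def find_adverts(connection, ranges):
    """Return advert ids matching every field range.

    ranges maps field_name -> (low, high); None means no bound on that side.
    """
    clauses, params = [], []
    for name, (low, high) in ranges.items():
        parts = ["field_name = ?"]
        params.append(name)
        if low is not None:
            parts.append("CAST(field_value AS INTEGER) >= ?")
            params.append(low)
        if high is not None:
            parts.append("CAST(field_value AS INTEGER) <= ?")
            params.append(high)
        clauses.append("(" + " AND ".join(parts) + ")")
    # An advert qualifies only if each requested field produced a matching row.
    sql = ("SELECT advert_id FROM tbl_advert_specific_fields_values "
           "WHERE " + " OR ".join(clauses) + " "
           "GROUP BY advert_id HAVING COUNT(DISTINCT field_name) = ? "
           "ORDER BY advert_id")
    params.append(len(ranges))
    return [row[0] for row in connection.execute(sql, params)]

both_fields = find_adverts(conn, {"t1_yearbuilt": (None, 2004),
                                  "t1_sqft": (None, 50)})
sqft_range = find_adverts(conn, {"t1_sqft": (30, 50)})
print(both_fields, sqft_range)
```

In the full query, the resulting advert ids could presumably be applied as p.id IN (...) in place of the direct join to asfv.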