JOIN on multiple tables giving duplicate records - MySql

Here is the list of my tables and the relevant columns:
users u
screen_name,
country,
status
twitter_users_relationship tf. This table has multiple target_screen_name values for each screen_name.
screen_name,
target_screen_name,
target_country,
follow_status
user_twitter_action_map ta
screen_name,
action_name,
action_status
user_targeted_countries utc. This table has multiple countries for each screen_name.
screen_name,
country_name
I want to get every target_screen_name from twitter_users_relationship whose target_country matches either u.country or utc.country_name.
My query so far:
SELECT u.screen_name,
u.country,
tf.target_screen_name,
tf.target_country,
ta.action_name,
ta.action_status,
utc.country_name
FROM users u
LEFT JOIN twitter_users_relationship tf
ON u.screen_name=tf.screen_name
LEFT JOIN user_twitter_action_map ta
ON u.screen_name=ta.screen_name
AND ta.action_name='follow'
AND ta.action_status='active'
LEFT JOIN user_targeted_countries utc
ON u.screen_name= utc.screen_name
WHERE u.status = 'active'
AND tf.follow_status = 'pending'
AND tf.target_country != ''
AND tf.target_country IS NOT NULL
AND ( utc.country_name=tf.target_country OR u.country=tf.target_country)
AND u.screen_name = 'my_screen_name';
But this query gives me a duplicate record for each country in user_targeted_countries. If there are 3 countries in user_targeted_countries, it returns 3 duplicate records.
Please let me know which JOIN I need to use with user_targeted_countries to get the desired results.
u.country can be different from the countries in utc.country_name.
UPDATE -
If I remove OR u.country=tf.target_country from the WHERE clause, I get all the matched target_screen_name values without duplicates. But I am not sure how to also get the records that match u.country=tf.target_country.

It depends on the business logic you need.
First, regardless of the question, your query is inconsistent (either the LEFT JOINs or the conditions are wrong). With a LEFT JOIN, conditions on the right-hand table belong in the ON clause; a WHERE condition on tf. or utc. columns discards the NULL-filled rows for unmatched users, so the LEFT JOIN behaves like an INNER JOIN. Move all the conditions on tf. and utc. to the ON clause, or use INNER JOIN if that is what you mean.
Secondly, you can use a GROUP BY clause and pick one of the utc.country_name values (the answer differs if you want a specific one; if it doesn't matter, use MAX() on this column).
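Another option, which is not from the answer above, is to not join user_targeted_countries at all and only test it with EXISTS, so it cannot multiply rows. Here is a minimal sketch of that idea. It runs on Python's sqlite3 purely as a stand-in for MySQL, the rows are made up, and the join to user_twitter_action_map is left out for brevity.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (screen_name TEXT, country TEXT, status TEXT);
CREATE TABLE twitter_users_relationship (
    screen_name TEXT, target_screen_name TEXT,
    target_country TEXT, follow_status TEXT);
CREATE TABLE user_targeted_countries (screen_name TEXT, country_name TEXT);

INSERT INTO users VALUES ('me', 'US', 'active');
INSERT INTO user_targeted_countries VALUES ('me', 'UK'), ('me', 'FR'), ('me', 'DE');
INSERT INTO twitter_users_relationship VALUES
    ('me', 'alice', 'US', 'pending'),
    ('me', 'bob',   'UK', 'pending'),
    ('me', 'carol', 'JP', 'pending');
""")

# The targeted countries are only checked for existence, never joined,
# so each relationship row appears at most once.
rows = conn.execute("""
SELECT u.screen_name, tf.target_screen_name, tf.target_country
FROM users u
JOIN twitter_users_relationship tf ON tf.screen_name = u.screen_name
WHERE u.status = 'active'
  AND tf.follow_status = 'pending'
  AND (tf.target_country = u.country
       OR EXISTS (SELECT 1
                  FROM user_targeted_countries utc
                  WHERE utc.screen_name = u.screen_name
                    AND utc.country_name = tf.target_country))
ORDER BY tf.target_screen_name
""").fetchall()

print(rows)
```

With the original LEFT JOIN on user_targeted_countries, the 'alice' row would come back three times, once per targeted country, because the OR u.country=tf.target_country branch is true for every joined row.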

Related

MySQL - How to get one of the repeated records given a condition in SQL?

I have the following results from a query.
The user "Adriana Smith" with ID 6 is repeated because she has several contract dates: I did a left join from table bo_users to bo_users_contracts (a 1:m, one-to-many relation). The query is below:
SELECT bo_users.ID, bo_users.display_name, COALESCE (bo_users_contracts.contract_start_date,'-') AS contract_start_date, COALESCE (bo_users_contracts.contract_end_date, '-') AS contract_end_date, COALESCE (bo_users_contracts.current,'-') AS current
FROM bo_users
LEFT JOIN bo_users_contracts ON bo_users.ID = bo_users_contracts.bo_users_id
LEFT JOIN bo_usermeta ON bo_users.ID = bo_usermeta.user_id
WHERE (bo_usermeta.meta_key = 'role' AND bo_usermeta.meta_value = 'member')
I want to get all users, but for the user Adriana I just want the row where the "current" column = 1.
So the final result would be 3 user records:
Alejandro, Rhonda and Adriana (with "current" = 1)
Thank you!

Since you want to filter on a table that is outer joined, the filter should be placed on the join itself so that all records from bo_users are retained (which is what the outer join is for).
Essentially the filter is applied before the join, so the records from bo_users without a match in bo_users_contracts are kept. If it is applied after the join in a WHERE clause, the bo_users records without a matching contract have a NULL value for current and are excluded when the current = 1 filter is applied.
In this example the only columns that should appear in the WHERE clause are those of bo_users.
I'd even move the bo_usermeta filters to the join, or you may lose bo_users rows; alternatively the left join on the 3rd table should be an inner join.
SELECT bo_users.ID
, bo_users.display_name
, COALESCE (bo_users_contracts.contract_start_date,'-') AS contract_start_date
, COALESCE (bo_users_contracts.contract_end_date, '-') AS contract_end_date
, COALESCE (bo_users_contracts.current,'-') AS current
FROM bo_users
LEFT JOIN bo_users_contracts
ON bo_users.ID = bo_users_contracts.bo_users_id
and bo_users_contracts.current = 1
LEFT JOIN bo_usermeta -- this is suspect
ON bo_users.ID = bo_usermeta.user_id
WHERE (bo_usermeta.meta_key = 'role' -- this is suspect
AND bo_usermeta.meta_value = 'member') -- this is suspect
The lines marked "this is suspect" are flagged because you have a left join, which means you want all users from bo_users. However, if a user doesn't have that meta_key or meta_value defined, the WHERE clause eliminates them. Either change the join to an inner join or move the WHERE conditions into the join. I point this out because your query is inconsistent in its definition, which leads to ambiguity when it is maintained later.
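To see the difference between filtering in ON and filtering in WHERE, here is a small sketch run on Python's sqlite3 as a stand-in for MySQL. The rows are invented, and the current column is renamed is_current in this sketch only, to avoid clashing with an SQLite keyword.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; data and the is_current name are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bo_users (ID INTEGER, display_name TEXT);
CREATE TABLE bo_users_contracts (
    bo_users_id INTEGER, contract_start_date TEXT, is_current INTEGER);

INSERT INTO bo_users VALUES (1, 'Alejandro'), (2, 'Rhonda'), (6, 'Adriana');
INSERT INTO bo_users_contracts VALUES
    (6, '2014-01-01', 0),
    (6, '2015-01-01', 1),
    (2, '2015-06-01', 0);
""")

# Filter inside ON: every user survives, unmatched ones get NULL contract columns.
filter_in_on = conn.execute("""
SELECT u.display_name, c.contract_start_date
FROM bo_users u
LEFT JOIN bo_users_contracts c
       ON c.bo_users_id = u.ID AND c.is_current = 1
ORDER BY u.ID
""").fetchall()

# Filter inside WHERE: the NULL rows fail the test, so users without a
# current contract disappear, as if this were an INNER JOIN.
filter_in_where = conn.execute("""
SELECT u.display_name, c.contract_start_date
FROM bo_users u
LEFT JOIN bo_users_contracts c ON c.bo_users_id = u.ID
WHERE c.is_current = 1
ORDER BY u.ID
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

The first query keeps Alejandro and Rhonda with a NULL start date, while the second returns only Adriana.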

MySQL - Trying to show results for rows that have 0 records...across 3 columns

There's a lot of Q&A out there for how to make MySQL show results for rows that have 0 records, but they all involve 1-2 tables/fields at most.
I'm trying to achieve the same ends, but across 3 fields, and I just can't seem to get it.
Here's what I've hacked together:
SELECT circuit.circuit_name, county.county_name, result.adr_result, count( result.adr_result ) AS num_results
FROM
(
SELECT cases.case_id, cases.county_id, cases.result_id
FROM cases
WHERE cases.status_id <> "2"
) q1
RIGHT JOIN county ON q1.county_id = county.county_id
RIGHT JOIN circuit ON county.circuit_id = circuit.circuit_id
RIGHT JOIN result ON q1.result_id = result.result_id
GROUP BY adr_result, circuit_name, county_name
ORDER BY circuit_name, county_name, adr_result
What I need to see is a list of ALL circuits in the first column, a list of ALL counties per circuit in the second column, a list of ALL possible adr_result entries for each county (they're the same for every county) in the third column, and then the respective count for the circuit/county/result combination-- even if it is 0. I've tried every combination of left, right and inner join (I know inner is definitely not the solution, but I'm frustrated) and just can't see where I'm going wrong.
Any help would be appreciated!

Here is a start. I can't follow your problem statement completely. For instance, what is the purpose of the cases table? Nonetheless, when you say "ALL" records for each of those tables, I interpret it as a Cartesian product, which is implemented through the derived table in the FROM clause (notice that result is joined without any join condition).
SELECT everthingjoin.circuit_name
, everthingjoin.county_name
, everthingjoin.adr_result
, COUNT(cases.case_id) AS num_results -- counts only matched cases, so 0 when there are none
FROM
(SELECT circuit.circuit_name, county.county_id, county.county_name,
result.result_id, result.adr_result
FROM circuit
JOIN county
ON county.circuit_id = circuit.circuit_id
CROSS JOIN result) AS everthingjoin
LEFT JOIN cases
ON cases.status_id <> "2"
AND cases.county_id = everthingjoin.county_id
AND cases.result_id = everthingjoin.result_id
GROUP BY adr_result, circuit_name, county_name
ORDER BY circuit_name, county_name, adr_result
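The "all combinations first, then LEFT JOIN the facts" idea can be checked on a tiny data set. This is only a sketch: it uses Python's sqlite3 in place of MySQL, the rows are invented, and it assumes each county belongs to one circuit through county.circuit_id, as the question's own joins suggest.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE circuit (circuit_id INTEGER, circuit_name TEXT);
CREATE TABLE county (county_id INTEGER, circuit_id INTEGER, county_name TEXT);
CREATE TABLE result (result_id INTEGER, adr_result TEXT);
CREATE TABLE cases (case_id INTEGER, county_id INTEGER,
                    result_id INTEGER, status_id INTEGER);

INSERT INTO circuit VALUES (1, 'First');
INSERT INTO county VALUES (10, 1, 'Adams'), (11, 1, 'Baker');
INSERT INTO result VALUES (100, 'Settled'), (101, 'Impasse');
INSERT INTO cases VALUES (1, 10, 100, 1), (2, 10, 100, 1), (3, 10, 101, 2);
""")

# Build every circuit/county/result combination, then attach matching cases.
# COUNT(ca.case_id) ignores the NULLs produced by the LEFT JOIN, giving 0.
rows = conn.execute("""
SELECT ci.circuit_name, co.county_name, r.adr_result,
       COUNT(ca.case_id) AS num_results
FROM circuit ci
JOIN county co ON co.circuit_id = ci.circuit_id
CROSS JOIN result r
LEFT JOIN cases ca
       ON ca.county_id = co.county_id
      AND ca.result_id = r.result_id
      AND ca.status_id <> 2
GROUP BY ci.circuit_name, co.county_name, r.adr_result
ORDER BY ci.circuit_name, co.county_name, r.adr_result
""").fetchall()

for row in rows:
    print(row)
```

Note that the status filter sits in the ON clause; in a WHERE clause it would remove the zero-count combinations again.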

Try this and see if it provides some ideas:
SELECT
circuit.circuit_name
, county.county_name
, result.adr_result
, COUNT(result.result_id) AS num_results
, COUNT(DISTINCT result.adr_result) AS num_distinct_results
FROM cases
LEFT JOIN county
ON cases.county_id = county.county_id
LEFT JOIN circuit
ON county.circuit_id = circuit.circuit_id
LEFT JOIN result
ON cases.result_id = result.result_id
WHERE cases.status_id <> "2"
GROUP BY
circuit.circuit_name
, county.county_name
, result.adr_result
ORDER BY
circuit_name, county_name, adr_result

MySql query runs very slow(actually never gives output) without where clause

I have a MySQL query that works fine when I use a WHERE clause, but when I do not use
a WHERE clause it hangs, never returns any output, and finally times out.
I used the EXPLAIN command to check the performance of the query, and in both cases EXPLAIN reports the same number of rows used in the joins.
I have attached an image of the EXPLAIN output.
Below is the query.
I can't figure out what the problem is here.
Any help is highly appreciated.
Thanks.
SELECT
MCI.CLIENT_ID AS CLIENT_ID, MCI.NAME AS CLIENT_NAME, MCI.PRIMARY_CONTACT AS CLIENT_PRIMARY_CONTACT,
MCI.ADDED_BY AS SP_ID, CONCAT(MUD_SP.FIRST_NAME, ' ', MUD_SP.LAST_NAME) AS SP_NAME,
MCI.FK_PROSPECT_ID AS PROSPECT_ID, MCI.DATE_ADDED AS ADDED_ON,
(SELECT GROUP_CONCAT(LT.TAG_TEXT SEPARATOR ', ')
FROM LK_TAG LT
INNER JOIN M_OBJECT_TAG_MAPPING MOTM
ON LT.PK_ID = MOTM.FK_TAG_ID
WHERE MOTM.FK_OBJECT_ID = MCI.FK_PROSPECT_ID
AND MOTM.OBJECT_TYPE = 1
AND MOTM.IS_ACTIVE = 1
) AS TAGS,
IFNULL(SUM(GET_DIGITS(MMR.RCP_AMOUNT)), 0) AS REVENUE_SO_FAR,
IFNULL(SUM(GET_DIGITS(MMR.RCP_RUPEES)), 0) AS REVENUE_INR,
COUNT(DISTINCT PMI_MONTHLY.PROJECT_ID) AS MONTHLY,
COUNT(DISTINCT PMI_FIXED.PROJECT_ID) AS FIXED,
COUNT(DISTINCT PMI_HOURLY.PROJECT_ID) AS HOURLY,
COUNT(DISTINCT PMI_ANNUAL.PROJECT_ID) AS ANNUAL,
COUNT(DISTINCT PMI_CURRENTLY_RUNNING.PROJECT_ID) AS CURRENTLY_RUNNING_PROJECTS,
COUNT(DISTINCT PMI_YET_TO_START.PROJECT_ID) AS YET_TO_START_PROJECTS,
COUNT(DISTINCT PMI_TECH_SALES_CLOSED.PROJECT_ID) AS TECH_SALES_CLOSED_PROJECTS
FROM
M_CLIENT_INFO MCI
INNER JOIN M_USER_DETAILS MUD_SP
ON MCI.ADDED_BY = MUD_SP.PK_ID
LEFT OUTER JOIN M_MONTH_RECEIPT MMR
ON MMR.CLIENT_ID = MCI.CLIENT_ID
LEFT OUTER JOIN M_PROJECT_INFO PMI_FIXED
ON PMI_FIXED.CLIENT_ID = MCI.CLIENT_ID AND PMI_FIXED.PROJECT_TYPE = 1
LEFT OUTER JOIN M_PROJECT_INFO PMI_MONTHLY
ON PMI_MONTHLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_MONTHLY.PROJECT_TYPE = 2
LEFT OUTER JOIN M_PROJECT_INFO PMI_HOURLY
ON PMI_HOURLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_HOURLY.PROJECT_TYPE = 3
LEFT OUTER JOIN M_PROJECT_INFO PMI_ANNUAL
ON PMI_ANNUAL.CLIENT_ID = MCI.CLIENT_ID AND PMI_ANNUAL.PROJECT_TYPE = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_CURRENTLY_RUNNING
ON PMI_CURRENTLY_RUNNING.CLIENT_ID = MCI.CLIENT_ID AND PMI_CURRENTLY_RUNNING.STATUS = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_YET_TO_START
ON PMI_YET_TO_START.CLIENT_ID = MCI.CLIENT_ID AND PMI_YET_TO_START.STATUS < 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_TECH_SALES_CLOSED
ON PMI_TECH_SALES_CLOSED.CLIENT_ID = MCI.CLIENT_ID AND PMI_TECH_SALES_CLOSED.STATUS > 4
WHERE YEAR(MCI.DATE_ADDED) = '2012'
GROUP BY MCI.CLIENT_ID ORDER BY CLIENT_NAME ASC

Yes, as many people have said, the key is that when you have the WHERE clause, the MySQL engine filters the M_CLIENT_INFO table first, probably dramatically.
You get a similar result to removing the WHERE clause by using this one:
where 1 = 1
You will see that performance degrades in the same way, because MySQL then has to process all the data.

Remove the where clause and all columns from select and add a count to see how many records you get. If it is reasonable, say up to 10k, then do the following,
put back the select columns related to M_CLIENT_INFO
do not include the nested one "TAGS"
remove all your joins
run your query without where clause and gradually include the joins
this way you'll find out when the timeout is caused.

I would try the following. First, MySQL has a keyword "STRAIGHT_JOIN" which tells the optimizer to join the tables in the order you've specified. Since all your left joins are child-related (like a lookup table), you don't want MySQL to try to treat one of those as the primary basis of the query.
SELECT STRAIGHT_JOIN ... rest of query.
Next, your M_PROJECT_INFO table. I don't know how many columns of data are in it, but you appear to be concentrating on just a few columns in your DISTINCT aggregates. I would make sure you have a covering index on these columns to help the query, via an index on
( Client_ID, Project_Type, Status, Project_ID )
This way the engine can apply the criteria and compute the distinct counts entirely from the index instead of going back to the raw data pages.
Third, your M_CLIENT_INFO table. Make sure it has an index covering your criteria, GROUP BY and ORDER BY columns, and change your ORDER BY from the alias "CLIENT_NAME" to the actual column of the table so it matches the index
( Date_Added, Client_ID, `Name` )
I have `Name` in backticks because it is also a MySQL keyword; the backticks make clear it refers to the column.
Next, the WHERE clause. Applying a function to an indexed column generally prevents the index from being used for that condition, especially on date/time fields. You might want to change your WHERE clause to
WHERE MCI.Date_Added between '2012-01-01' and '2012-12-31 23:59:59'
so the BETWEEN range covers the entire year and the index can be used.
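As a quick check that a plain range selects the same rows as the function-based filter, here is a sketch on Python's sqlite3, standing in for MySQL. SQLite has no YEAR(), so strftime('%Y', ...) plays that role here; the rows and the index name are made up. It uses a half-open range (>= start of year, < start of next year), a variation on the BETWEEN above that avoids having to spell out the last second of the year.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; rows and index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m_client_info (client_id INTEGER, name TEXT, date_added TEXT);
CREATE INDEX idx_date_added ON m_client_info (date_added);

INSERT INTO m_client_info VALUES
    (1, 'Acme',    '2011-12-31 23:59:59'),
    (2, 'Birch',   '2012-01-01 00:00:00'),
    (3, 'Cedar',   '2012-12-31 23:59:59'),
    (4, 'Dogwood', '2013-01-01 00:00:00');
""")

# Function applied to the column: the value must be computed for every row.
by_function = conn.execute("""
SELECT client_id FROM m_client_info
WHERE strftime('%Y', date_added) = '2012'
ORDER BY client_id
""").fetchall()

# Bare column compared against constants: a range an index can serve.
by_range = conn.execute("""
SELECT client_id FROM m_client_info
WHERE date_added >= '2012-01-01' AND date_added < '2013-01-01'
ORDER BY client_id
""").fetchall()

print(by_function, by_range)
```

Both filters pick clients 2 and 3; only the second leaves the column untouched so that an index on it is usable.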
Finally, if the above does not help, I would consider splitting your query up. The GROUP_CONCAT inline select for the TAGS might be a bit of a killer for you. You might want to compute all the distinct elements for the grouping per client first, THEN get those details. Something like
select
PQ.*,
group_concat(...) tags
from
( the entire primary part of the query ) as PQ
Left join yourGroupConcatTableBasis on key columns

COUNT evaluate to zero if no matching records

Take the following:
SELECT
Count(a.record_id) AS newrecruits
,a.studyrecord_id
FROM
visits AS a
INNER JOIN
(
SELECT
record_id
, MAX(modtime) AS latest
FROM
visits
GROUP BY
record_id
) AS b
ON (a.record_id = b.record_id) AND (a.modtime = b.latest)
WHERE (((a.visit_type_id)=1))
GROUP BY a.studyrecord_id;
I want to amend the COUNT part to display a zero if there are no records since I assume COUNT will evaluate to Null.
I have tried the following but still get no results:
IIF(ISNULL(COUNT(a.record_id)),0,COUNT(a.record_id)) AS newrecruits
Is this an issue because the join is on record_id? I tried changing the INNER to LEFT but also received no results.
Q
How do I get the above to evaluate to zero if there are no records matching the criteria?
Edit:
To give a little detail to the reasoning.
The studies table contains a field called 'original_recruits' based on activity before use of the database.
The visits tables tracks new_recruits (Count of records for each study).
I combine these in another query (original_recruits + new_recruits). If there have been no new recruits I still need to display the original_recruits, so if there are no records I need it to evaluate to zero instead of null so the final sum still works.

It seems like you want to count records per study record.
If you need a count of zero for studies that have no visits, you need to join from a table that lists every study, here assumed to be named StudyRecords.
Do you have one? Otherwise it makes no sense to ask for rows for studies that appear nowhere in your data.
Supposing the StudyRecords table exists, the query should look something like this:
SELECT
Count(a.record_id) AS newrecruits -- a.record_id is NULL when a studyrecord has no matching visit, so the count is 0
, sr.Id
FROM
studyrecord sr
LEFT OUTER JOIN
(
visits AS a
INNER JOIN
(
SELECT
record_id
, MAX(modtime) AS latest
FROM
visits
GROUP BY
record_id
) AS b
ON (a.record_id = b.record_id) AND (a.modtime = b.latest)
)
ON sr.Id = a.studyrecord_id
AND a.visit_type_id = 1
GROUP BY sr.Id
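For what it's worth, COUNT itself never returns NULL; the rows go missing because a group with no visits does not exist at all unless the query starts from the table that lists every study. A minimal sketch of that, using Python's sqlite3 as a stand-in database with made-up rows (the studyrecord table and its Id column are the assumption from the answer above):

```python
import sqlite3

# In-memory SQLite stand-in; the studyrecord table and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studyrecord (Id INTEGER);
CREATE TABLE visits (record_id INTEGER, studyrecord_id INTEGER,
                     visit_type_id INTEGER, modtime TEXT);

INSERT INTO studyrecord VALUES (1), (2);
INSERT INTO visits VALUES
    (100, 1, 1, '2014-01-01'),
    (100, 1, 1, '2014-02-01'),
    (101, 1, 1, '2014-01-15');
""")

# Start from studyrecord so every study yields a row; the latest version of
# each visit record is attached with a LEFT JOIN, and COUNT skips the NULLs.
rows = conn.execute("""
SELECT sr.Id, COUNT(v.record_id) AS newrecruits
FROM studyrecord sr
LEFT JOIN (
    SELECT a.record_id, a.studyrecord_id
    FROM visits a
    JOIN (SELECT record_id, MAX(modtime) AS latest
          FROM visits
          GROUP BY record_id) b
      ON a.record_id = b.record_id AND a.modtime = b.latest
    WHERE a.visit_type_id = 1
) v ON v.studyrecord_id = sr.Id
GROUP BY sr.Id
ORDER BY sr.Id
""").fetchall()

print(rows)
```

Study 1 gets a count of 2 (one row per visit record, latest version only) and study 2, which has no visits, gets 0.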

I solved the problem by amending the final query where I display the result of combining the original and new recruits to include the IIF there.
SELECT
a.*
, IIF(IsNull([totalrecruits]),consents,totalrecruits)/a.target AS prog
, IIf(IsNull([totalrecruits]),consents,totalrecruits) AS trecruits
FROM
q_latest_studies AS a
LEFT JOIN q_totalrecruitment AS b
ON a.studyrecord_id=b.studyrecord_id
;

Getting last element from Group By

I have this query...
$sQuery = "
SELECT SQL_CALC_FOUND_ROWS ".str_replace(" , ", " ", implode(", ", $aColumns))."
FROM dominios left join datas on dominios.id_dominio=datas.id_dominio
left join dnss on dominios.id_dominio=dnss.id_dominio
left join entidades_gestoras on dominios.id_dominio=entidades_gestoras.id_dominio
left join estados on dominios.id_dominio=estados.id_dominio
left join ips on dominios.id_dominio=ips.id_dominio
left join quantidade_dnss on dominios.id_dominio=quantidade_dnss.id_dominio
left join responsaveis_tecnicos on dominios.id_dominio=responsaveis_tecnicos.id_dominio
left join titulares on dominios.id_dominio=titulares.id_dominio
WHERE dominios.estado not like 2 and dominios.estado not like 0 AND data_expiracao > '".date("Ymd")."' $sWhere $where
GROUP BY dominio
$sOrder
$sLimit
";
It returns the results I need...
But with the GROUP BY it shows me the first row that appears in the database for each group, and I need the last...
How can I do this? :s
Edited
This is the final query, without those variables
SELECT SQL_CALC_FOUND_ROWS `datas`.`data_insercao`, `datas`.`data_expiracao`, `datas`.`data_registo`,
`dominios`.`dominio`,
`titulares`.`nome`, `titulares`.`morada`, `titulares`.`email`, `titulares`.`localidade`, `titulares`.`cod_postal`,
`entidades_gestoras`.`nome`, `entidades_gestoras`.`email`,
`responsaveis_tecnicos`.`nome`, `responsaveis_tecnicos`.`email`,
`ips`.`ip`, `dominios`.`id_dominio` FROM dominios left join datas on dominios.id_dominio=datas.id_dominio
left join dnss on dominios.id_dominio=dnss.id_dominio
left join entidades_gestoras on dominios.id_dominio=entidades_gestoras.id_dominio
left join estados on dominios.id_dominio=estados.id_dominio
left join ips on dominios.id_dominio=ips.id_dominio
left join quantidade_dnss on dominios.id_dominio=quantidade_dnss.id_dominio
left join responsaveis_tecnicos on dominios.id_dominio=responsaveis_tecnicos.id_dominio left join titulares on dominios.id_dominio=titulares.id_dominio WHERE dominios.estado not like 2 and dominios.estado not like 0 AND data_expiracao > '20120730' GROUP BY dominio ORDER BY `datas`.`data_insercao` asc LIMIT 0, 10

General considerations
I'm not sure what columns you have in aColumns, or what table that dominio column comes from. When you group a number of rows using GROUP BY, the columns you select for your result should either have the same value for all rows of the group (i.e. be functionally dependent on the grouping columns), or be an aggregate function combining the values of all the rows in the group.
Some SQL dialects enforce this. MySQL doesn't, but if you select an unaggregated column which has different values within the group, there are no guarantees as to which value will actually be returned to you. It might come from any row within the group. So there is no way to get the “last” of these rows, as there isn't any inherent order. In simple cases you can use MIN or MAX to select the value you need. In more complicated cases, you'll most likely have to use subqueries to do the selection from within the groups.
For example, this answer computes for every Name (which corresponds to your dominio grouping column) the last value of Action based on an ordering by ascending Time. Or rather the first value using a descending ordering, which is the same.
Your application
As your comment below indicates that you want the maximal id_dominio for each dominio in dominios, I suggest the following:
SELECT …
FROM (SELECT MAX(id_dominio) AS id_dominio
FROM dominios
WHERE estado <> 2
AND estado <> 0
GROUP BY dominio
) domIds
LEFT JOIN datas ON domIds.id_dominio=datas.id_dominio
…
So there will be one subquery to compute the maximal id_dominio for each dominio group, and all subsequent joins can use the IDs from that subquery instead of the full dominios table. If you need other columns from the dominios table as well, you might have to include that table in the join again, so that you can get all the values from those rows whose IDs you selected in the subquery.
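A small runnable sketch of that pattern, on Python's sqlite3 standing in for MySQL and with invented rows; only dominios and datas are included, and dominios is joined back in to fetch the dominio name:

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dominios (id_dominio INTEGER, dominio TEXT, estado INTEGER);
CREATE TABLE datas (id_dominio INTEGER, data_expiracao TEXT);

INSERT INTO dominios VALUES
    (1, 'example.pt', 1),
    (2, 'example.pt', 1),
    (3, 'example.pt', 2),
    (4, 'other.pt',   1);
INSERT INTO datas VALUES
    (1, '20130101'), (2, '20140101'), (3, '20150101'), (4, '20130601');
""")

# The subquery picks the highest eligible id_dominio per dominio; the outer
# joins then fetch the remaining columns for exactly those rows.
rows = conn.execute("""
SELECT d.dominio, d.id_dominio, datas.data_expiracao
FROM (SELECT MAX(id_dominio) AS id_dominio
      FROM dominios
      WHERE estado <> 2 AND estado <> 0
      GROUP BY dominio) domIds
JOIN dominios d ON d.id_dominio = domIds.id_dominio
LEFT JOIN datas ON datas.id_dominio = domIds.id_dominio
ORDER BY d.dominio
""").fetchall()

print(rows)
```

Row 3 has the highest id for example.pt but is excluded by its estado, so row 2 is the one returned for that dominio.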

ORDER BY sorts in ascending order by default; to get the last records first you need to sort in DESCENDING order:
$sOrder DESC
