Using DISTINCT inside JOIN is creating trouble [duplicate]

Closed 10 years ago as a possible duplicate of: How can I modify this query with two Inner Joins so that it stops giving duplicate results?
I'm having trouble getting my query to work.
SELECT itpitems.identifier, itpitems.name, itpitems.subtitle, itpitems.description,
       itpitems.itemimg, itpitems.mainprice, itpitems.upc, itpitems.isbn,
       itpitems.weight, itpitems.pages, itpitems.publisher,
       itpitems.medium_abbr, itpitems.medium_desc,
       itpitems.series_abbr, itpitems.series_desc,
       itpitems.voicing_desc, itpitems.pianolevel_desc, itpitems.bandgrade_desc,
       itpitems.category_code, itprank.overall_ranking,
       itpitnam.name AS artist, itpitnam.type_code
FROM itpitems
INNER JOIN itprank ON (itprank.item_number = itpitems.identifier)
INNER JOIN (SELECT DISTINCT type_code FROM itpitnam) itpitnam ON (itprank.item_number = itpitnam.item_number)
WHERE mainprice > 1
LIMIT 3
I keep getting Unknown column 'itpitnam.name' in 'field list'.
However, if I change DISTINCT type_code to *, I do not get that error, but I do not get the results I want either.
This is a big result table so I am making a dummy example...
With *, I get something like:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 2          | Amy  | R         |
| 7          | Mike | B         |
+------------+------+-----------+
The problem here is that I have two instances of identifier = 2 because the type_code is different. I have tried GROUP BY at the outside end of the query, but it is sifting through so many records it creates too much strain on the server, so I'm trying to find an alternative way of getting the results I need.
What I want to achieve (using the same dummy output) would look something like this:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 7          | Mike | B         |
| 8          | Sam  | R         |
+------------+------+-----------+
It should skip over the duplicate identifier, regardless of whether the type_code is different.
Can someone help me modify this query to get the results as simulated in the above chart?

One approach is to use an inline view, as your query already does, but with a GROUP BY instead of DISTINCT to eliminate the duplicates. The simplest inline view that satisfies your requirements would be:
( SELECT n.item_number, n.name, n.type_code
    FROM itpitnam n
   GROUP BY n.item_number
) itpitnam
Note that it is not deterministic which row of itpitnam supplies the values for name and type_code, and that this form relies on MySQL's extension to GROUP BY: with the ONLY_FULL_GROUP_BY SQL mode enabled, the query is rejected. A more elaborate inline view can make the choice of row specific.
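One way to make the choice of row specific is sketched below, under stated assumptions: for each item_number it keeps the row that sorts first by (type_code, name). The Python harness and the in-memory SQLite database are stand-ins used only to make the sketch runnable, the sample rows are taken from the dummy output in the question, and the NOT EXISTS formulation is just one possible inline view, not necessarily the one the answer had in mind.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE itpitnam (item_number INTEGER, name TEXT, type_code TEXT)")
conn.executemany(
    "INSERT INTO itpitnam VALUES (?, ?, ?)",
    [(2, "Joe", "A"), (2, "Amy", "R"), (7, "Mike", "B"), (8, "Sam", "R")],
)

# Keep a row only if no other row for the same item sorts before it by
# (type_code, name). That leaves exactly one row per item_number as long
# as (item_number, type_code, name) is unique.
DETERMINISTIC_VIEW = """
SELECT n.item_number, n.name, n.type_code
  FROM itpitnam n
 WHERE NOT EXISTS (
         SELECT 1
           FROM itpitnam m
          WHERE m.item_number = n.item_number
            AND (m.type_code < n.type_code
                 OR (m.type_code = n.type_code AND m.name < n.name))
       )
 ORDER BY n.item_number
"""

one_per_item = conn.execute(DETERMINISTIC_VIEW).fetchall()
print(one_per_item)
```

The SELECT inside DETERMINISTIC_VIEW (without the ORDER BY) could be used in place of the GROUP BY inline view; it does not depend on the GROUP BY extension, so it should also be accepted with ONLY_FULL_GROUP_BY enabled.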
Another common approach to this type of problem is to use a correlated subquery in the SELECT list. For returning a small set of rows, this can perform reasonably well. But for returning large sets, there are more efficient approaches.
SELECT i.identifier
, i.name
, i.subtitle
, i.description
, i.itemimg
, i.mainprice
, i.upc
, i.isbn
, i.weight
, i.pages
, i.publisher
, i.medium_abbr
, i.medium_desc
, i.series_abbr
, i.series_desc
, i.voicing_desc
, i.pianolevel_desc
, i.bandgrade_desc
, i.category_code
, r.overall_ranking
, ( SELECT n1.name
      FROM itpitnam n1
     WHERE n1.item_number = r.item_number
     ORDER BY n1.type_code, n1.name
     LIMIT 1
  ) AS artist
, ( SELECT n2.type_code
      FROM itpitnam n2
     WHERE n2.item_number = r.item_number
     ORDER BY n2.type_code, n2.name
     LIMIT 1
  ) AS type_code
FROM itpitems i
JOIN itprank r
ON r.item_number = i.identifier
WHERE mainprice > 1
LIMIT 3
That query returns the specified result set, with one significant difference. The original query uses an INNER JOIN to the itpitnam table, which means a row is returned only if there is a matching row in itpitnam. The query above instead emulates an OUTER JOIN: it returns a row even when no matching row is found in itpitnam, with NULL for artist and type_code.
UPDATE
For best performance of those correlated subqueries, you'll want an appropriate index available:
... ON itpitnam (item_number, type_code, name)
That index is the most appropriate because it's a "covering index": the query can be satisfied entirely from the index, without referencing data pages in the underlying table. There's also an equality predicate on the leading column and an ORDER BY on the next two columns, so MySQL can avoid a "sort" operation.
--
If you have a guarantee that either the type_code or name column in the itpitnam table is NOT NULL, you can add a predicate to eliminate the rows that are "missing" a matching row, e.g.
HAVING artist IS NOT NULL
(Adding that will likely have an impact on performance.) Absent that kind of guarantee, you'd need to add an INNER JOIN or a predicate that tests for the existence of a matching row, to get an INNER JOIN behavior.
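The difference between the OUTER JOIN behavior of the correlated subqueries and the INNER JOIN behavior obtained with an existence test can be sketched as follows. This is a minimal illustration under assumptions: SQLite (driven from Python) stands in for MySQL, the tables are cut down to a few columns, and the sample rows, including item 9 with no itpitnam row, are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itpitems (identifier INTEGER PRIMARY KEY, name TEXT, mainprice REAL);
CREATE TABLE itprank  (item_number INTEGER, overall_ranking INTEGER);
CREATE TABLE itpitnam (item_number INTEGER, name TEXT, type_code TEXT);
INSERT INTO itpitems VALUES (2, 'Item two', 5.0), (7, 'Item seven', 9.0), (9, 'Item nine', 3.0);
INSERT INTO itprank  VALUES (2, 10), (7, 20), (9, 30);
INSERT INTO itpitnam VALUES (2, 'Joe', 'A'), (2, 'Amy', 'R'), (7, 'Mike', 'B');
""")

# Correlated subquery in the SELECT list: items without an itpitnam row
# still come back, with NULL as the artist.
BASE = """
SELECT i.identifier,
       (SELECT n1.name
          FROM itpitnam n1
         WHERE n1.item_number = r.item_number
         ORDER BY n1.type_code, n1.name
         LIMIT 1) AS artist
  FROM itpitems i
  JOIN itprank r ON r.item_number = i.identifier
 WHERE i.mainprice > 1
"""

# Existence test that restores INNER JOIN behavior.
EXISTS_FILTER = """
   AND EXISTS (SELECT 1 FROM itpitnam n WHERE n.item_number = r.item_number)
"""
ORDER = " ORDER BY i.identifier"

outer_like = conn.execute(BASE + ORDER).fetchall()
inner_like = conn.execute(BASE + EXISTS_FILTER + ORDER).fetchall()
print(outer_like)
print(inner_like)
```

Unlike the HAVING artist IS NOT NULL variant, the EXISTS predicate does not depend on name or type_code being NOT NULL.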

SELECT a.*,
       b.overall_ranking,
       c.name AS artist,
       c.type_code
FROM itpitems a
INNER JOIN itprank b
    ON b.item_number = a.identifier
INNER JOIN itpitnam c
    ON b.item_number = c.item_number
INNER JOIN
    (
        SELECT item_number, MAX(type_code) code
        FROM itpitnam
        GROUP BY item_number
    ) d ON c.item_number = d.item_number AND
           c.type_code = d.code
WHERE mainprice > 1
LIMIT 3
Follow-up question: can you please post the table schema and explain how the tables are related to each other? Then I will know which columns to link.

Related

SQL Distinct based on different column

I'm having trouble counting distinct values in one column based on another column. The case is:
Table: List
well  | wbore   | op
------+---------+-----
wella | wbore_a | op_a
wella | wbore_a | op_b
wella | wbore_a | op_b
wella | wbore_b | op_c
wella | wbore_b | op_c
wellb | wbore_g | op_t
wellb | wbore_g | op_t
wellb | wbore_h | op_k
I want the output to appear in separate columns, like this:
well  | total_wbore | total_op
------+-------------+---------
wella | 2           | 3
wellb | 2           | 2
The real case involves several tables, but to simplify I've shown it as if everything were in one table.
The sql query that I tried:
SELECT well.well_name, wellbore.wellbore_name, operation.operation_name, COUNT(*)
FROM well
INNER JOIN wellbore ON wellbore.well_uid = well.well_uid
INNER JOIN operation ON wellbore.well_uid = operation.well_uid
GROUP BY well.well_name,wellbore.wellbore_name
HAVING COUNT(*) > 1
But this query counts duplicate rows, which does not meet the requirement. Can anyone help?

You need to use COUNT(DISTINCT ...):
SELECT
count(distinct wellbore.wellbore_name) as total_wbore,
count(distinct operation.operation_name) as total_op
FROM well
INNER JOIN wellbore ON wellbore.well_uid = well.well_uid
INNER JOIN operation ON wellbore.well_uid = operation.well_uid

Final query:
SELECT
well.well_name,
COUNT(DISTINCT wellbore.wellbore_name) AS total_wbore,
COUNT(DISTINCT operation.operation_name) AS total_op
FROM well
INNER JOIN wellbore ON wellbore.well_uid = well.well_uid
INNER JOIN operation ON wellbore.well_uid = operation.well_uid
GROUP BY well.well_name
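To see why COUNT(DISTINCT ...) is needed here, the final query can be run against a small data set. The sketch below is an illustration under assumptions: SQLite (driven from Python) stands in for MySQL, and the three-table schema and rows are hypothetical, modelled on the single-table example in the question. The extra COUNT(*) column is added only to show how many joined rows each well produces.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE well      (well_uid INTEGER PRIMARY KEY, well_name TEXT);
CREATE TABLE wellbore  (well_uid INTEGER, wellbore_name TEXT);
CREATE TABLE operation (well_uid INTEGER, operation_name TEXT);
INSERT INTO well      VALUES (1, 'wella'), (2, 'wellb');
INSERT INTO wellbore  VALUES (1, 'wbore_a'), (1, 'wbore_b'),
                             (2, 'wbore_g'), (2, 'wbore_h');
INSERT INTO operation VALUES (1, 'op_a'), (1, 'op_b'), (1, 'op_c'),
                             (2, 'op_t'), (2, 'op_k');
""")

# Joining wellbore and operation on well_uid pairs every wellbore of a well
# with every operation of that well, so COUNT(*) counts the pairs, while
# COUNT(DISTINCT ...) counts each name once.
totals = conn.execute("""
SELECT well.well_name,
       COUNT(DISTINCT wellbore.wellbore_name)   AS total_wbore,
       COUNT(DISTINCT operation.operation_name) AS total_op,
       COUNT(*)                                 AS joined_rows
  FROM well
 INNER JOIN wellbore  ON wellbore.well_uid = well.well_uid
 INNER JOIN operation ON wellbore.well_uid = operation.well_uid
 GROUP BY well.well_name
 ORDER BY well.well_name
""").fetchall()
print(totals)
```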

Static SQL query replace to dynamic column

I have the following query:
http://www.sqlfiddle.com/#!9/752e34/3
This query uses subqueries in the SELECT list.
"SELECT a.*
,(SELECT s.value FROM tbl_scd AS s WHERE s.tag_id = 1 AND s.main_id = a.id ORDER BY s.date_time DESC LIMIT 1) AS title
,(SELECT s.value FROM tbl_scd AS s WHERE s.tag_id = 2 AND s.main_id = a.id ORDER BY s.date_time DESC LIMIT 1) AS alt
FROM tbl_main AS a
WHERE 1;"
Now I'm looking for a way to add a new row to tbl_tag without changing the query above, i.e. the SELECT-in-SELECT part should become dynamic and be driven by tbl_tag.
To get this:
+----+---------------+-----------+-----------+--------------+
| id | date | title | alt | new_column |
+----+---------------+-----------+-----------+--------------+
| 1 | 2018-10-10 | test1-1 | test1-3 | NULL |
| 2 | 2018-10-11 | test2-1 | test2-1 | NULL |
+----+---------------+-----------+-----------+--------------+
It would be great to get an idea or help.
Thanks

Your last comment on your question about using JOIN makes it clearer to me (I think) what you are after. JOINs will definitely help you a lot here, in place of the rather cumbersome query you are currently using.
Try this:
SELECT
tbl_main.date,
tblA.value AS title,
tblB.value AS alt
FROM
tbl_main
INNER JOIN (SELECT main_id, tag_id, value
FROM tbl_scd
INNER JOIN tbl_tag ON (tbl_scd.tag_id = tbl_tag.id)
WHERE tbl_tag.name = 'title') tblA
ON (tbl_main.id = tblA.main_id)
INNER JOIN (SELECT main_id, tag_id, value
FROM tbl_scd
INNER JOIN tbl_tag ON (tbl_scd.tag_id = tbl_tag.id)
WHERE tbl_tag.name = 'alt') tblB
ON (tbl_main.id = tblB.main_id);
I think this will get you much closer to a general solution to what it looks like you are trying to achieve, or at least point you in a good direction with using JOINs.
I also think you might benefit from re-thinking your database design, because this kind of pivoting rows from one table into columns in a query output can be an indicator that the data might be better off structured differently.
In any case, I hope this helps.

An explanation with SQL query

I'm trying to get some data for my JavaFX application from a couple of tables in a MySQL database.
Here's the query:
select veturattable.id, veturattable.vetura,veturattable.modeli,veturattable.ngjyra,
veturattable.targa, renttable.pagesa, hargjimettable.shuma
from veturattable
left join hargjimettable
on hargjimettable.veturaid= veturattable.id
left join renttable
on renttable.veturaid = veturattable.id ;
Here is the data from renttable
And here is the data from hargjimettable
What I need is a result like this:
veturaid | pagesa | shuma
1 | 150 | 91
10 | 110 | 40

You need two subqueries that pre-aggregate the sums per vehicle ID, and then join each one individually back to the main table. If you don't, you get a Cartesian product: every record in hargjimettable for a given ID is joined to every record in renttable for that ID. So if you have 2 records in the first table and 3 in the second, you get 2 × 3 = 6 rows, and the sums are inflated accordingly.
By pre-aggregating each table grouped by its ID key, you have at most one record per vehicle for each sum, so you just grab that record if it exists. The LEFT JOINs keep vehicles without matching records in the result, and COALESCE() replaces the resulting NULLs with 0.
select
v.id,
v.vetura,
v.modeli,
v.ngjyra,
v.targa,
COALESCE( RSum.SumPagesa, 0 ) as AllPagesa,
COALESCE( HSum.SumShuma, 0 ) as AllShuma
from
veturattable v
left join
( select
h.veturaid,
SUM( h.shuma ) as SumShuma
from
hargjimettable h
group by
h.veturaid ) HSum
ON v.id = HSum.veturaid
left join
( select
r.veturaid,
SUM( r.pagesa ) as SumPagesa
from
renttable r
group by
r.veturaid ) RSum
ON v.id = RSum.veturaid
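The effect of the row multiplication can be checked on a small data set. The sketch below is an illustration under assumptions: SQLite (driven from Python) stands in for MySQL, the tables are reduced to the columns that matter, and the individual rows are hypothetical, chosen so that the correct totals match the expected output in the question (150 and 91 for vehicle 1, 110 and 40 for vehicle 10).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE veturattable   (id INTEGER PRIMARY KEY, vetura TEXT);
CREATE TABLE hargjimettable (veturaid INTEGER, shuma INTEGER);
CREATE TABLE renttable      (veturaid INTEGER, pagesa INTEGER);
INSERT INTO veturattable   VALUES (1, 'car A'), (3, 'car B'), (10, 'car C');
INSERT INTO hargjimettable VALUES (1, 50), (1, 41), (10, 40);
INSERT INTO renttable      VALUES (1, 100), (1, 50), (10, 110);
""")

# Joining both detail tables directly: vehicle 1 has 2 x 2 = 4 joined rows,
# so every amount is counted twice.
naive = conn.execute("""
SELECT v.id, SUM(r.pagesa), SUM(h.shuma)
  FROM veturattable v
  LEFT JOIN hargjimettable h ON h.veturaid = v.id
  LEFT JOIN renttable r      ON r.veturaid = v.id
 GROUP BY v.id
 ORDER BY v.id
""").fetchall()

# Pre-aggregating each detail table gives at most one row per vehicle
# before the join, so nothing is multiplied.
pre_aggregated = conn.execute("""
SELECT v.id,
       COALESCE(RSum.SumPagesa, 0) AS AllPagesa,
       COALESCE(HSum.SumShuma, 0)  AS AllShuma
  FROM veturattable v
  LEFT JOIN (SELECT veturaid, SUM(shuma) AS SumShuma
               FROM hargjimettable
              GROUP BY veturaid) HSum
         ON v.id = HSum.veturaid
  LEFT JOIN (SELECT veturaid, SUM(pagesa) AS SumPagesa
               FROM renttable
              GROUP BY veturaid) RSum
         ON v.id = RSum.veturaid
 ORDER BY v.id
""").fetchall()

print(naive)
print(pre_aggregated)
```

Vehicle 3 has no rows in either detail table; it shows NULL sums in the direct join and 0 in the pre-aggregated query because of COALESCE().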

You actually want MAX() and SUM() together with a GROUP BY, like this:
select max(veturattable.id) as id, max(veturattable.vetura) as vetura,
max(veturattable.modeli) as modeli,
max(veturattable.ngjyra) as ngjyra,
max(veturattable.targa) as targa,
max(renttable.pagesa) as pagesa,
sum(hargjimettable.shuma) as shuma
from veturattable
left join hargjimettable
on hargjimettable.veturaid= veturattable.id
left join renttable
on renttable.veturaid = veturattable.id
group by veturattable.id;

MySQL simple localization

I'm trying to build some sort of localization into my DB.
For example, I have 3 tables (img 1). The languages table contains the different languages. The localization table has 3 fields: 'id' is the id of the string, 'language' is the language of the string (id and language together are my primary key), and 'value' is the localized string. tableOne has 'id', 'Col1' and 'Col2'; Col1 and Col2 contain the IDs of the localizable strings.
After localizing, I expect to get one of the green tables instead of the original, depending on a language parameter.
I've done it the way shown below and it works, but I'd like to know whether there is a better way, because now I have to create an INNER JOIN block for each column that must be localized. I'm afraid it will be very slow.
I also tried creating a temporary table holding all records of the required language and then doing the same inner joins against it, so that lookups only search the records of one language. That didn't work, because I still needed multiple inner joins to that temp table, which is not possible.
SELECT
`One`.`id` AS 'id',
`loc1`.`value` AS 'Col1',
`loc2`.`value` AS 'Col2'
FROM
`tableOne` AS `One`
INNER JOIN
`localization` AS `loc1`
ON `loc1`.`id` = `One`.`Col1`
AND `loc1`.`language` = 'en'
INNER JOIN
`localization` AS `loc2`
ON `loc2`.`id` = `One`.`Col2`
AND `loc2`.`language` = 'en'
img 1

If you want to reduce the number of JOINs needed, try returning the values as rows instead of columns. You could do so like this:
SET @lang := 'en';
SELECT 1, tmp.value
FROM (
  SELECT value
  FROM localization
  WHERE language = @lang AND id IN (543, 345)) tmp;
I first set a language parameter, and then pull all values for that language from the localization table, using the ids inside an IN operator. You'll get results like this:
| 1 | one |
| 1 | two |
If you have to use the format given in the first table, try a single inner join that pulls the rows for the specific language and ids, like this:
SELECT t1.id, t1.col1, t1.col2,
  CASE WHEN l.id = t1.col1 THEN l.value ELSE null END AS col1Value,
  CASE WHEN l.id = t1.col2 THEN l.value ELSE null END AS col2Value
FROM firstTable t1
JOIN localization l ON l.id IN (t1.col1, t1.col2) AND l.language = @lang;
Unfortunately, this is not yet the final solution; it gives you rows like:
| 1 | 543 | 345 | one | null |
| 1 | 543 | 345 | null | two |
To collapse those into one row per id and remove the nulls, add MAX() and a GROUP BY. This still evaluates a CASE expression for each localized column, but it needs only one JOIN and looks a little more manageable:
SELECT t1.id,
  MAX(CASE WHEN l.id = t1.col1 THEN l.value ELSE null END) AS col1Value,
  MAX(CASE WHEN l.id = t1.col2 THEN l.value ELSE null END) AS col2Value
FROM firstTable t1
JOIN localization l ON l.id IN (t1.col1, t1.col2) AND l.language = @lang
GROUP BY t1.id;
Here is an SQL Fiddle example. I don't think the CASE expressions will bog you down too much, but let me know how this performs against your actual database.
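A runnable sketch of the single-JOIN pivot, under assumptions: SQLite (driven from Python) stands in for MySQL, the language is passed as a bound parameter instead of a user variable, and the German rows as well as the helper name localized are hypothetical additions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE firstTable   (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER);
CREATE TABLE localization (id INTEGER, language TEXT, value TEXT,
                           PRIMARY KEY (id, language));
INSERT INTO firstTable   VALUES (1, 543, 345);
INSERT INTO localization VALUES (543, 'en', 'one'),  (345, 'en', 'two'),
                                (543, 'de', 'eins'), (345, 'de', 'zwei');
""")

# One JOIN fetches every localized string a row needs; MAX() over the CASE
# expressions folds the per-string rows back into one row per id, because
# MAX() ignores the NULLs produced by the non-matching branch.
PIVOT = """
SELECT t1.id,
       MAX(CASE WHEN l.id = t1.col1 THEN l.value END) AS col1Value,
       MAX(CASE WHEN l.id = t1.col2 THEN l.value END) AS col2Value
  FROM firstTable t1
  JOIN localization l
    ON l.id IN (t1.col1, t1.col2) AND l.language = ?
 GROUP BY t1.id
"""

def localized(lang):
    """Return the rows of firstTable with ids replaced by strings in `lang`."""
    return conn.execute(PIVOT, (lang,)).fetchall()

print(localized("en"))
print(localized("de"))
```

Adding another localized column means one more MAX(CASE ...) expression and one more entry in the IN list, rather than another JOIN.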

Find records that match a certain constraint based on NOT EXISTS

I have a query that selects some data, and I'm creating a suppression table so that rows containing certain data are ignored.
suppression: occasion_id | days_before
reminder: id | days_before | occasion_id
I'm using NOT EXISTS to exclude certain reminders.
The query is:
SELECT id
FROM Reminder AS r
WHERE NOT EXISTS (SELECT 1
                  FROM Suppression s
                  WHERE s.occasion_id = r.occasion_id
                  AND s.days_before = r.days_before)
Reminder:
1) 101 | 1 | 18
2) 102 | 7 | 18
Suppression: 18 | 1
The 1st reminder should be ignored and the 2nd one should be included.
For example, if the suppression table contains occasion_id 18 and days_before 1, the select should ignore any reminder with those values.
The subquery returns '1' for the 2nd reminder as well. Why does that happen even though the WHERE clause matches no rows?

You are joining on the wrong condition.
Change the join condition from s.id = r.id to s.occasion_id = r.occasion_id:
SELECT r.*
FROM Reminder AS r
INNER JOIN Suppression s ON s.occasion_id = r.occasion_id
AND s.days_before <> r.days_before
Fiddle
Note that this join returns only reminders whose occasion has a suppression row with a different days_before; a reminder whose occasion has no suppression row at all is dropped.
Your query also works; just change the join condition:
SELECT id
FROM Reminder AS r
WHERE NOT EXISTS (SELECT 1
                  FROM Suppression s
                  WHERE s.occasion_id = r.occasion_id
                  AND s.days_before = r.days_before)

I believe what you're trying to accomplish could be done more easily with a simple LEFT JOIN and an IS NULL check in your WHERE clause:
SELECT r.id
FROM Reminder AS r
LEFT JOIN Suppression s ON s.occasion_id = r.occasion_id AND s.days_before = r.days_before
WHERE s.occasion_id IS NULL
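The two anti-join forms, NOT EXISTS and LEFT JOIN with IS NULL, can be compared on the sample data from the question. This is a minimal sketch under assumptions: SQLite (driven from Python) stands in for MySQL, both forms match on occasion_id and days_before, and reminder 103, whose occasion has no suppression row at all, is a hypothetical extra row.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Reminder    (id INTEGER PRIMARY KEY, days_before INTEGER, occasion_id INTEGER);
CREATE TABLE Suppression (occasion_id INTEGER, days_before INTEGER);
INSERT INTO Reminder    VALUES (101, 1, 18), (102, 7, 18), (103, 1, 20);
INSERT INTO Suppression VALUES (18, 1);
""")

# Anti-join written with NOT EXISTS: keep reminders with no matching suppression.
not_exists = [row[0] for row in conn.execute("""
SELECT r.id
  FROM Reminder AS r
 WHERE NOT EXISTS (SELECT 1
                     FROM Suppression s
                    WHERE s.occasion_id = r.occasion_id
                      AND s.days_before = r.days_before)
 ORDER BY r.id
""")]

# The same anti-join written as LEFT JOIN ... IS NULL: unmatched reminders
# carry NULLs in the Suppression columns.
left_join = [row[0] for row in conn.execute("""
SELECT r.id
  FROM Reminder AS r
  LEFT JOIN Suppression s
         ON s.occasion_id = r.occasion_id
        AND s.days_before = r.days_before
 WHERE s.occasion_id IS NULL
 ORDER BY r.id
""")]

print(not_exists)
print(left_join)
```

Both forms suppress reminder 101 and keep 102 and 103.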
