MySQL Select WHERE IN recordset - mysql

I'll try to explain my problem. I have two tables. In the first, each record is identified by a unique INT code (a counter). In the second, the code from the first table is one of the fields (and may be repeated across records).
I want to SELECT the codes from the second table, based on WHERE parameters, knowing that the result will be a recordset with possibly repeated codes, and then use this recordset for another SELECT on the first table, WHERE CODE IN the above recordset (from the second table).
Is this possible? And if yes, how do I do it?
Usually, when I use the WHERE IN clause, the list can contain repeated values, like WHERE Code IN (3,4,5,6,3,4,2), right? The difference here is that I want to use a previously selected recordset in place of the list.

Is this possible? Sure is.
And if yes, how to do this? Like most questions, the answer is: it depends. There's more than one way to skin this cat, and the best answer can vary with the data (volume of records) and the indexes.
You can join the tables and use DISTINCT or GROUP BY to limit the table A records: because the relationship from A to B is one-to-many, the values from A would otherwise be repeated. If you need values from B as well, this is the way to do it.
Select A.Code, count(B.A_CODE) as countOfBRecords
from B
LEFT JOIN A
on A.Code = B.A_Code
WHERE B.Parameter = 'value'
and B.Parameter2 = 'Value2'
group by A.Code
Or using your approach (works if you ONLY need values/columns from table A.)
Select A.Code
from A
Where A.Code in (Select B.A_CODE
From B WHERE B.Parameter = 'value'
and B.Parameter2 = 'Value2')
But these can be slow depending on data/indexes.
You don't need the distinct on the inner query as A.Code only exists once and thus wouldn't be repeated. It's the JOIN which would cause the records to repeat not the where clause.
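To make the point about repeats concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table contents and the Name and Id columns are made up for illustration; only the A.Code / B.A_Code relationship comes from the question.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Code INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE B (Id INTEGER PRIMARY KEY, A_Code INTEGER, Parameter TEXT);
    INSERT INTO A VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Code 1 appears twice in B, code 3 once, code 2 only with another parameter.
    INSERT INTO B (A_Code, Parameter)
    VALUES (1, 'value'), (1, 'value'), (3, 'value'), (2, 'other');
""")

# IN with a subquery: duplicates inside the subquery do not multiply rows of A.
in_rows = conn.execute("""
    SELECT A.Code FROM A
    WHERE A.Code IN (SELECT B.A_Code FROM B WHERE B.Parameter = 'value')
    ORDER BY A.Code
""").fetchall()

# Plain join: every matching row of B yields one output row, so code 1 repeats.
join_rows = conn.execute("""
    SELECT A.Code FROM A
    JOIN B ON B.A_Code = A.Code
    WHERE B.Parameter = 'value'
    ORDER BY A.Code
""").fetchall()

print(in_rows)    # [(1,), (3,)]
print(join_rows)  # [(1,), (1,), (3,)]
```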
A correlated subquery returns each A.Code only once; this works if you ONLY need values from table A.
Select A.Code
From A
Where exists (Select 1
from B
where B.Parameter = 'value' ...
AND A.Code = B.A_CODE)
Since there's no join, no records from A are repeated. On larger data sets this generally performs better.
This last approach works because it correlates the outer table with the subselect. Note that this can only go one level deep in a relationship; if you tried to correlate across multiple levels this way, it wouldn't work.
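As a rough sketch of the EXISTS form next to the GROUP BY form, again with sqlite3 standing in for MySQL. The Id column on B and the sample rows are assumptions, not part of the question.

```python
import sqlite3

# SQLite stands in for MySQL; B.Id and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Code INTEGER PRIMARY KEY);
    CREATE TABLE B (Id INTEGER PRIMARY KEY, A_Code INTEGER, Parameter TEXT);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B (A_Code, Parameter)
    VALUES (1, 'value'), (1, 'value'), (3, 'value'), (2, 'other');
""")

# Correlated EXISTS: one row per A.Code that has at least one matching B row.
exists_rows = conn.execute("""
    SELECT A.Code FROM A
    WHERE EXISTS (SELECT 1 FROM B
                  WHERE B.Parameter = ? AND B.A_Code = A.Code)
    ORDER BY A.Code
""", ("value",)).fetchall()

# Join plus GROUP BY: also one row per A.Code, with a count taken from B.
counted_rows = conn.execute("""
    SELECT A.Code, COUNT(B.Id) AS countOfBRecords
    FROM A JOIN B ON B.A_Code = A.Code
    WHERE B.Parameter = ?
    GROUP BY A.Code
    ORDER BY A.Code
""", ("value",)).fetchall()

print(exists_rows)   # [(1,), (3,)]
print(counted_rows)  # [(1, 2), (3, 1)]
```

Use the EXISTS form when only columns from A are needed, and the grouped join when something from B (here a count) has to come back too.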

Related

MYSQL - SubSelect when FK does and doesnt exists

Situation Overview
The current question is about selecting values from two tables: table A (Material) and table B (MaterialRevision). However, the PK of table A might or might not exist in table B. When it doesn't exist, the query described in this question won't return the values of table A, but it SHOULD. So basically here's what's happening:
The query only returns values when A.id exists in B.id, when in fact I need it to also return values from A when A.id doesn't exist in B.id.
Problem:
Suppose two tables. Table Material and Table Material Revision.
Notice that the PK idMaterial is a FK in MaterialRevision.
Current "Mock" Tables
Query Objective
Obs: remember these two tables are a simplification of the real tables.
For each Material, print the material variables and the last (MAX) RevisionDate from MaterialRevision. In case there's no RevisionDate, print BLANK ("") for the "last revision date".
What is wrongly happening
For each Material, print the material variables and the last (MAX) RevisionDate from MaterialRevision. In case there's no Revision for the Material, the Material isn't printed (SKIP).
Current Code
SELECT
Material.idMaterial,
Material.nextRevisionDate,
Material.obsolete,
Revision.revisionDate AS lastRevisionDate
FROM Material,
(SELECT MaterialRevision.idMaterial, max(MaterialRevision.revisionDate) as "revisionDate" from MaterialRevision
GROUP BY MaterialRevision.idMaterial
) AS Revision
WHERE (Material.idMaterial = Revision.idMaterial AND Material.obsolete = 0)
References and Links used to reach the state described in this question
Why is MAX() 100 times slower than ORDER BY ... LIMIT 1?
MySQL get last date records from multiple
MySQL - How to SELECT based on value of another SELECT
MySQL Query Select where id does not exist in the JOIN table
PS: I hope this question is correctly understood, since it took me a lot of time to build it. I researched a lot on Stack Overflow, and after several failed attempts I had no option but to ask for help.
You should use JOIN :
SELECT m.idMaterial, m.nextRevisionDate, mr.revisionDate AS "lastRevisionDate"
FROM Material m
LEFT JOIN MaterialRevision AS mr ON mr.idMaterial = m.idMaterial AND mr.revisionDate = (
SELECT MAX(ch.revisionDate) from MaterialRevision ch
WHERE mr.idMaterial = ch.idMaterial)
WHERE m.obsolete = 0
Here is an explanation of what INNER JOIN, LEFT JOIN and RIGHT JOIN are. (You will love them if you often cross tables in your queries)
As m.obsolete will always be 0 (because of the WHERE clause), I omitted it from the SELECT clause.
You should use the left outer join instead of using the cross product.
Your query should be something like this:
SELECT m.idMaterial, m.nextRevisionDate, m.obsolete,
mr.revisionDate AS lastRevisionDate
FROM Material AS m
LEFT OUTER JOIN MaterialRevision AS mr ON
m.idMaterial = mr.idMaterial
AND mr.revisionDate = (SELECT MAX(ch.revisionDate) FROM MaterialRevision ch
WHERE mr.idMaterial = ch.idMaterial)
WHERE m.obsolete = 0;
Here you can find some documentation about types of join.
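For what it's worth, here is a small sketch of the LEFT JOIN idea that also prints a blank for materials without a revision, as the question asked. It is a variant, not either answer's exact query: it joins to a grouped derived table and wraps the date in COALESCE. SQLite stands in for MySQL, and the idRevision column and sample rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL; idRevision and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Material (
        idMaterial INTEGER PRIMARY KEY, nextRevisionDate TEXT, obsolete INTEGER);
    CREATE TABLE MaterialRevision (
        idRevision INTEGER PRIMARY KEY, idMaterial INTEGER, revisionDate TEXT);
    INSERT INTO Material VALUES
        (1, '2020-01-01', 0), (2, '2020-06-01', 0), (3, '2020-09-01', 1);
    -- Material 2 has no revisions at all.
    INSERT INTO MaterialRevision (idMaterial, revisionDate) VALUES
        (1, '2019-01-10'), (1, '2019-05-20'), (3, '2019-03-03');
""")

rows = conn.execute("""
    SELECT m.idMaterial,
           m.nextRevisionDate,
           COALESCE(r.lastRevisionDate, '') AS lastRevisionDate
    FROM Material AS m
    LEFT JOIN (
        -- One row per material with its latest revision date.
        SELECT idMaterial, MAX(revisionDate) AS lastRevisionDate
        FROM MaterialRevision
        GROUP BY idMaterial
    ) AS r ON r.idMaterial = m.idMaterial
    WHERE m.obsolete = 0
    ORDER BY m.idMaterial
""").fetchall()

print(rows)  # [(1, '2020-01-01', '2019-05-20'), (2, '2020-06-01', '')]
```

The dates are stored as ISO text here, so MAX compares them correctly as strings; with a real DATE column the same query shape applies.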

Join on 3 tables insanely slow on giant tables

I have a query which goes like this:
SELECT insanlyBigTable.description_short,
insanlyBigTable.id AS insanlyBigTable,
insanlyBigTable.type AS insanlyBigTableLol,
catalogpartner.id AS catalogpartner_id
FROM insanlyBigTable
INNER JOIN smallerTable ON smallerTable.id = insanlyBigTable.catalog_id
INNER JOIN smallerTable1 ON smallerTable1.catalog_id = smallerTable.id
AND smallerTable1.buyer_id = 'xxx'
WHERE smallerTable1.cont = 'Y' AND insanlyBigTable.type IN ('111','222','33')
GROUP BY smallerTable.id;
Now, when I run the query the first time, it copies the giant table into a temp table... I want to know how I can prevent that. I am considering a nested query, or even reversing the join (not sure whether that would make it run faster), but that is, well, not nice. Any other suggestions?
To figure out how to optimize your query, we first have to boil down exactly what it is selecting so that we can preserve that information while we change things around.
What your query does
So, it looks like we need the following:
The GROUP BY clause limits the results to at most one row per catalog_id.
smallerTable1.cont = 'Y', insanlyBigTable.type IN ('111','222','33'), and buyer_id = 'xxx' appear to be the filters on the query.
And we want data from insanlyBigTable and ... catalogpartner? I would guess that catalogpartner is smallerTable1, since the id of smallerTable is linked to the catalog_id of the other tables.
I'm not sure what the purpose of putting the buyer_id filter in the ON clause was, but unless you tell me differently, I'll assume its placement there is unimportant.
The point of the query
I am unsure about the intent of the query, based on that GROUP BY clause. You will obtain just one row per catalog_id in insanlyBigTable, but you don't appear to care which row it is. Indeed, the fact that you can run this query at all is due to a special non-standard feature in MySQL that lets you SELECT columns that do not appear in the GROUP BY clause... however, you don't get to choose WHICH row each column's value comes from. This means you could have information from 4 different rows for each of your selected items.
My best guess, based on column names, is that you are trying to bring back a list of items that are in the same catalog as something that was purchased by a given buyer, but without any more than one item per catalog. In addition, you want something to connect back to the purchased item in that catalog, via the catalogpartner table's id.
So, something probably akin to amazon's "You may like these items because you purchased these other items" feature.
The new query
We want 1 row per insanlyBigTable.catalog_id, based on which catalog_id exists in smallerTable1, after filtering.
SELECT
ibt.description_short,
ibt.id AS insanlyBigTable,
ibt.type AS insanlyBigTableLol,
(
SELECT st.id FROM smallerTable1 st
WHERE st.buyer_id = 'xxx'
AND st.cont = 'Y'
AND st.catalog_id = ibt.catalog_id
LIMIT 1
) AS catalogpartner_id
FROM insanlyBigTable ibt
WHERE ibt.id IN (
SELECT (
SELECT ibt.id AS ibt_id
FROM insanlyBigTable ibt
WHERE ibt.catalog_id = sti.catalog_id
LIMIT 1
) AS ibt_id
FROM (
SELECT DISTINCT(catalog_id) FROM smallerTable1 st
WHERE st.buyer_id = 'xxx'
AND st.cont = 'Y'
AND EXISTS (
SELECT * FROM insanlyBigTable ibt
WHERE ibt.type IN ('111','222','33')
AND ibt.catalog_id = st.catalog_id
)
) AS sti
)
This query should generate the same result as your original query, but it breaks things down into smaller queries to avoid the use (and abuse) of the GROUP BY clause on the insanlyBigTable.
Give it a try and let me know if you run into problems.
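If the "which row do I get" question matters, one option is to make the choice explicit. The sketch below is not the query above; it is a different formulation that assumes "the lowest id per catalog" is an acceptable pick, and that smallerTable1 is the catalogpartner table, as guessed earlier. SQLite stands in for MySQL, the sample rows are invented, and nothing here says anything about speed on the real tables.

```python
import sqlite3

# SQLite stands in for MySQL; column lists and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE insanlyBigTable (
        id INTEGER PRIMARY KEY, catalog_id INTEGER, type TEXT, description_short TEXT);
    CREATE TABLE smallerTable1 (
        id INTEGER PRIMARY KEY, catalog_id INTEGER, buyer_id TEXT, cont TEXT);
    INSERT INTO insanlyBigTable VALUES
        (10, 1, '111', 'a'), (11, 1, '222', 'b'), (12, 2, '999', 'c'), (13, 3, '33', 'd');
    INSERT INTO smallerTable1 VALUES
        (100, 1, 'xxx', 'Y'), (101, 2, 'xxx', 'Y'), (102, 3, 'other', 'Y');
""")

rows = conn.execute("""
    SELECT ibt.description_short, ibt.id, ibt.type, picked.catalogpartner_id
    FROM (
        -- Pick exactly one big-table row per catalog: the lowest matching id.
        SELECT b.catalog_id,
               MIN(b.id)  AS ibt_id,
               MIN(st.id) AS catalogpartner_id
        FROM insanlyBigTable AS b
        JOIN smallerTable1 AS st ON st.catalog_id = b.catalog_id
        WHERE st.buyer_id = 'xxx' AND st.cont = 'Y'
          AND b.type IN ('111', '222', '33')
        GROUP BY b.catalog_id
    ) AS picked
    JOIN insanlyBigTable AS ibt ON ibt.id = picked.ibt_id
    ORDER BY ibt.id
""").fetchall()

print(rows)  # [('a', 10, '111', 100)]
```

Every selected column now comes from the same, well-defined row, so the result does not depend on the non-standard GROUP BY behaviour.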

Join TableA, TableB and TableC to get data from TableA and TableC

Why I am getting so many records for this
SELECT e.OneColumn, fb.OtherColumn
FROM dbo.TABLEA FB
INNER JOIN dbo.TABLEB eo ON FB.Primary = eo.Foreign
INNER JOIN dbo.TABLEC e ON eo.Primary =e.Foreign
WHERE FB.SomeOtherColumn = 0
When I run this I get millions of records, which cannot be correct, since all of the tables have fewer records than that.
I need to get columns from TableA and TableC, and because they are not directly related, I have to use TableB as a bridge.
EDIT
Below is the count:
TABLEA = 273551
TABLEB = 384412
TABLEC = 13046
Above query = after 2 minutes I forcibly cancelled the query; by that time the count was 11437613
Any suggestion?
To figure out what is going on in a query where the results are not as expected, I tend to do this. First I change it to a SELECT * (note this is only for figuring out the problem; do not use SELECT * in production, ever!). Then I add an ORDER BY on the ID field from TableA if there is not one in the query.
So now I run the query up to the first table, including any WHERE conditions that are from the first table. I comment out the rest. I note the number of records returned.
Now I add in the second table and any WHERE conditions from it. If I am expecting a one-to-one relationship and this query doesn't return the same number of records, then I look at the data that is being returned to see if I can figure out why. Since the contents are ordered by the TableA ID, you can usually see examples of duplicated records fairly easily and then scroll over until you find the field that causes the difference. Often this means that you need some sort of additional WHERE clause or aggregation on the fields in the next table to limit it to only one record. Just note down the problem at this point, though, as you may be able to make the change more effectively in the next join.
So add in the third table and, again, note the number of records and then look closely at the data where the ID from A is repeated. Look at the columns you intend to return: are they always the same for an ID? If they are different, then you do not have a one-to-one relationship, and you need to understand that either there is a data integrity problem or you are mistaken in thinking there is a one-to-one. If they are the same, then a derived table may be in order. You only need the IDs from TableB, so the join could look something like this:
JOIN (SELECT MIN(Primary) AS Primary, Foreign FROM TABLEB GROUP BY Foreign) eo ON FB.Primary = eo.Foreign
Hope this helps.
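The step-by-step counting described above can be sketched like this, using Python's sqlite3 as a stand-in for SQL Server. The column names PrimaryId and ForeignId replace the question's placeholder names (Primary and Foreign are reserved words), and the rows are invented to show a fan-out.

```python
import sqlite3

# SQLite stands in for SQL Server; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (PrimaryId INTEGER PRIMARY KEY, OtherColumn TEXT, SomeOtherColumn INTEGER);
    CREATE TABLE TableB (PrimaryId INTEGER PRIMARY KEY, ForeignId INTEGER);
    CREATE TABLE TableC (Id INTEGER PRIMARY KEY, ForeignId INTEGER, OneColumn TEXT);
    INSERT INTO TableA VALUES (1, 'a1', 0), (2, 'a2', 0);
    INSERT INTO TableB VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO TableC VALUES (100, 10, 'c'), (101, 10, 'c'), (102, 11, 'c'), (103, 12, 'd');
""")

# Add one join at a time and record how the row count changes.
steps = [
    "FROM TableA fb",
    "FROM TableA fb JOIN TableB eo ON fb.PrimaryId = eo.ForeignId",
    "FROM TableA fb JOIN TableB eo ON fb.PrimaryId = eo.ForeignId "
    "JOIN TableC e ON eo.PrimaryId = e.ForeignId",
]
counts = [
    conn.execute(f"SELECT COUNT(*) {step} WHERE fb.SomeOtherColumn = 0").fetchone()[0]
    for step in steps
]
print(counts)  # [2, 3, 4]: each join multiplies the rows

# A derived table that keeps one TableB row per TableA id removes the first fan-out.
derived_count = conn.execute("""
    SELECT COUNT(*)
    FROM TableA fb
    JOIN (SELECT MIN(PrimaryId) AS PrimaryId, ForeignId
          FROM TableB GROUP BY ForeignId) eo ON fb.PrimaryId = eo.ForeignId
    WHERE fb.SomeOtherColumn = 0
""").fetchone()[0]
print(derived_count)  # 2
```

The jump from 2 to 3 shows where the first duplication enters; the same check on the TableC join would show the second.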

Conditional Select Statement in MS Access

I want to create a query in MS Access that will display information from two tables based on the values in one table. Both of these tables have the same exact columns. One has set records and the other one has records a visitor can insert/edit/delete. For the purpose of this question I will call the tables TableA and TableB. TableA has the predetermined records and can not be changed. Multiple users will be using these records. Visitors would add records to TableB. I need a query that will display the records from TableA unless a visitor adds a record to TableB and then it displays that record. The field I need to join on is CategoryID. So what I need is basically like this;
If TableB.CategoryID Is Not Null Then
Select * From TableB
Else
Select * From TableA
End If
Thanks for any assistance anyone can provide.
JW
You get part of the way there by unioning the individual table queries; that works if there's nothing in B, but it still shows the A records when there is.
So suppose we created a table just like A, say A2, but with an added column: the number of records in B. And then we select all of the records in A2 where this new column is 0, and only the columns originally in A; call this A3.
Now consider the union of A3 and B. If B is empty, we get A. If B is not empty, then none of the records from A2 will be chosen for A3, and we're left with just B.
That is easier than it seems at first. You'll have to join both tables on CategoryID and then conditionally select the right item like this:
SELECT tA.CategoryID, IIF(tB.CategoryID IS NULL, tA.txtEntry, tB.txtEntry) AS EntryText,
tB.CategoryID IS NULL AS bOriginalEntry
FROM TableA AS tA LEFT JOIN TableB AS tB ON tA.CategoryID=tB.CategoryID
However, there is one caveat: if TableB is empty then the join produces an empty set! Just populate TableB with at least one record (preferably one with an invalid CategoryID so it won't join with a valid record in TableA).
The bOriginalEntry is just a boolean expression to show whether the EntryText stems from TableA or TableB.
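Here is a small sketch of the same conditional select, run against SQLite instead of Access, so treat it as an approximation. IIF is written as a CASE expression, the sample rows are made up, and the boolean column comes back as 1/0 here rather than Access's True/False. In this SQLite stand-in the LEFT JOIN returns the TableA rows even while TableB is still empty; I have not checked how Access behaves in that case.

```python
import sqlite3

# SQLite stands in for MS Access; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (CategoryID INTEGER PRIMARY KEY, txtEntry TEXT);
    CREATE TABLE TableB (CategoryID INTEGER PRIMARY KEY, txtEntry TEXT);
    INSERT INTO TableA VALUES (1, 'default 1'), (2, 'default 2');
""")

QUERY = """
    SELECT tA.CategoryID,
           CASE WHEN tB.CategoryID IS NULL THEN tA.txtEntry
                ELSE tB.txtEntry END AS EntryText,
           tB.CategoryID IS NULL AS bOriginalEntry
    FROM TableA AS tA
    LEFT JOIN TableB AS tB ON tA.CategoryID = tB.CategoryID
    ORDER BY tA.CategoryID
"""

before = conn.execute(QUERY).fetchall()      # TableB is empty
conn.execute("INSERT INTO TableB VALUES (2, 'visitor 2')")
after = conn.execute(QUERY).fetchall()       # visitor overrides category 2

print(before)  # [(1, 'default 1', 1), (2, 'default 2', 1)]
print(after)   # [(1, 'default 1', 1), (2, 'visitor 2', 0)]
```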
I found this thread searching for a similar problem. Note to self and others.
You can use the Join Types to cope with potential different values in a conditional select,
MS Access doesn't have the full range of JOIN that MS SQL has, but you can "fudge" it.
eg
Full outer joins: all the data, combined where feasible
In some systems, an outer join can include all rows from both tables, with rows combined when they correspond. This is called a full outer join, and Access doesn’t explicitly support them. However, you can use a cross join and criteria to achieve the same effect.
https://support.office.com/en-us/article/join-tables-and-queries-3f5838bd-24a0-4832-9bc1-07061a1478f6#typesofjoins

Selecting multiple columns/fields in MySQL subquery

Basically, there is an attribute table and translation table - many translations for one attribute.
I need to select id and value from translation for each attribute in a specified language, even if there is no translation record in that language. Either I am missing some join technique, or a join (without involving the language table) does not work here, since the following does not return attributes with non-existing translations in the specified language.
select a.attribute, at.id, at.translation
from attribute a left join attributeTranslation at on a.id=at.attribute
where at.language=1;
So I am using subqueries like this. The problem here is making two subqueries to the same table with the same parameters (it feels like a performance drain unless MySQL groups those, which I doubt, since it makes you write many similar subqueries).
select attribute,
(select id from attributeTranslation where attribute=a.id and language=1),
(select translation from attributeTranslation where attribute=a.id and language=1)
from attribute a;
I would like to be able to get id and translation from one query, so I concat columns and get the id from string later, which is at least making single subquery but still not looking right.
select attribute,
(select concat(id,';',title)
from offerAttribute_language
where offerAttribute=a.id and _language=1
)
from offerAttribute a
So the question part.
Is there a way to get multiple columns from a single subquery, or should I use two subqueries (is MySQL smart enough to group them?), or is joining in the following way the way to go:
[[attribute to language] to translation] (joining 3 tables seems like worse performance than a subquery).
Yes, you can do this. The knack you need is the concept that there are two ways of getting tables out of the table server. One way is ..
FROM TABLE A
The other way is
FROM (SELECT col as name1, col2 as name2 FROM ...) B
Notice that the select clause and the parentheses around it are a table, a virtual table.
So, using your second code example (I am guessing at the columns you are hoping to retrieve here):
SELECT a.attr, b.id, b.trans, b.lang
FROM attribute a
JOIN (
SELECT at.id AS id, at.translation AS trans, at.language AS lang, at.attribute
FROM attributeTranslation at
) b ON (a.id = b.attribute AND b.lang = 1)
Notice that your real table attribute is the first table in this join, and that this virtual table I've called b is the second table.
This technique comes in especially handy when the virtual table is a summary table of some kind. e.g.
SELECT a.attr, b.id, b.trans, b.lang, c.langcount
FROM attribute a
JOIN (
SELECT at.id AS id, at.translation AS trans, at.language AS lang, at.attribute
FROM attributeTranslation at
) b ON (a.id = b.attribute AND b.lang = 1)
JOIN (
SELECT count(*) AS langcount, at.attribute
FROM attributeTranslation at
GROUP BY at.attribute
) c ON (a.id = c.attribute)
See how that goes? You've generated a virtual table c containing two columns, joined it to the other two, used one of the columns for the ON clause, and returned the other as a column in your result set.
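A runnable sketch of the virtual-table technique, with one change that is my assumption rather than part of the answer: it uses LEFT JOIN and filters on the language inside the derived table, so attributes without a translation in that language still come back (with NULLs), which is what the question asked for. SQLite stands in for MySQL, the alias t replaces at, and the sample rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, attribute TEXT);
    CREATE TABLE attributeTranslation (
        id INTEGER PRIMARY KEY, attribute INTEGER, language INTEGER, translation TEXT);
    INSERT INTO attribute VALUES (1, 'color'), (2, 'size');
    -- 'size' has no translation in language 1.
    INSERT INTO attributeTranslation VALUES
        (10, 1, 1, 'colour'), (11, 1, 2, 'couleur'), (12, 2, 2, 'taille');
""")

rows = conn.execute("""
    SELECT a.attribute, b.id, b.trans, c.langcount
    FROM attribute AS a
    LEFT JOIN (
        -- Virtual table b: id and translation together, for one language.
        SELECT t.id AS id, t.translation AS trans, t.attribute AS attribute
        FROM attributeTranslation AS t
        WHERE t.language = 1
    ) AS b ON b.attribute = a.id
    LEFT JOIN (
        -- Virtual table c: a per-attribute summary.
        SELECT COUNT(*) AS langcount, t.attribute AS attribute
        FROM attributeTranslation AS t
        GROUP BY t.attribute
    ) AS c ON c.attribute = a.id
    ORDER BY a.id
""").fetchall()

print(rows)  # [('color', 10, 'colour', 2), ('size', None, None, 1)]
```

Both id and translation arrive from a single pass over attributeTranslation, so there is no need for two scalar subqueries or for concatenating columns.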