What methods are recommended for Selecting multiple columns within a nested subquery? It's been a while since I've coded any queries and I'm having some difficulty wrapping my head around this. The specific challenge is on Line 2 of the code below. The IN operand doesn't quite work here (see error message below), and I'm not sure if it's simply a matter of the syntax I'm using, and/or there is a much better way to go about this (i.e. using the HAVING operand or a JOIN statement)
SELECT * FROM Rules WHERE Rules.LNRule_id
IN(SELECT LNRule_id1,LNRule_id2,LNRule_id3,LNRule_id4 FROM Silhouette
WHERE Silhouette.Silhouette_Skirt=(SELECT Silhouette_Skirt FROM Style
WHERE Style.Style_Skirt='$Style_Skirt')
)
The purpose of this query is to SELECT all the relevant rows in table Rules for a particular value in table Style (i.e. $Style_Skirt), which it does by matching it to one of several factors - in this case the garment's Silhouette. What I am therefore trying to do in this portion of the query is SELECT all rows in table Rules who's ID (LNRule_id) matches values in any of the specified columns in table Silhouette
(SELECT LNRule_id1,LNRule_id2,LNRule_id3,LNRule_id4 FROM Silhouette WHERE Silhouette.Silhouette_Skirt=(...))
Edit
There is a many-to-many relationship (each Silhouette has several applicable Rules, and each Rule can apply to several Silhouettes). All the rules reside in the table 'Rules' (one per row), and each rule has an id ('LNRule_id'). The table 'Silhouette' has columns which tell it which rows need to be called from 'Rules' by 'LNRule_id' (LNRule_id1,2,3,4 indicate which Rules should be called, and store the values of the id's for the relevant rows in table 'Rules')
The error message currently being generated by the IN Operand is:
SQLSTATE[21000]: Cardinality violation: 1241 Operand should contain 1
column(s)
I think you want this query
SELECT * FROM Rules r
JOIN Silhouette s
(ON r.LNRule_id=s.LNRule_id1
OR r.LNRule_id=s.LNRule_id2
OR r.LNRule_id=s.LNRule_id3
OR r.LNRule_id=s.LNRule_id4)
JOIN Style st
ON s.Silhouette_Skirt=st.Silhouette_Skirt
WHERE st.Silhouette_Skirt = '$Style_Skirt'
mysql is complaining that on one side of IN you have a single column, and on the other side you have a multi-column rowset. In order for the IN operator to work, the rowset on the right side of IN must have the exact same number of columns as the left side; in this case, one column.
What you are trying to accomplish could perhaps be achieved if you did something like WHERE LNRule_id IN( SELECT LNRule_Id1 ...) OR LNRule_id IN( SELECT LNRule_Id2 ...) OR ... OR ... but the resulting query would be a monstrosity, and its performance would be horrendous. There may be other ways to go about it too, but anything you try will probably be similarly atrocious.
I do not have enough information to be absolutely sure about what I am saying, but it seems to me that the reason why you have this problem is that your database schema is not normalized. Generally, whenever you see a table with a group of columns having names that all begin with the same prefix and end with a number, it means that someone, somewhere, did not normalize their data.
To address the edit in your question, what you have implemented might conceptually be a many to many relationship, but as far as relational databases are concerned, (you know, the science, the theory, the established practices, the approaches necessary to get things to actually work,) it is definitely not a many to many relationship. Many to many relationships are most certainly not implemented with column1, column2, column3, ... columnN. To be sure that I am not making this stuff up, you can read what others say about many to many relationships here:
Many-to-many relations in RDBMS databases
So, my suggestion, if I correctly understand what is going on, would be to introduce a new table, called SilhouetteRules, which contains two columns, silhouette_id and rule_id. This table will implement a many-to-many relationship between silhouettes and your rules. Then of course you get rid of all the rule1, rule2, rule3, etc. columns from Silhouette.
Once you have done that, you can obtain all silhouettes and all rules associated with them using a query like this:
SELECT * FROM Silhouette
LEFT JOIN SilhouetteRules ON
Silhouette.id = SilhouetteRules.silhouette_id
LEFT JOIN Rules ON
SilhouetteRules.rule_id = Rules.id
The above query will yield multiple rows for each silhouette, where the silhouette fields will be identical from row to row, and only the rule fields will differ. Do not be surprised by this, that's how relational databases work.
Given a given_silhouette_id, you can retrieve all rules associated with it using a query like this:
SELECT * FROM Rules
LEFT JOIN SilhouetteRules ON
Rules.id = SilhouetteRules.rule_id
WHERE
SilhouetteRules.silhouette_id = given_silhouette_id
So, you are going to be using this query as a subquery in queries like the one in the question.
Now, regarding the query in the question, I am unable to tell you exactly how you would need to modify it to get it to work with the normalization that I proposed, because I cannot make sense of it. You see, even if you fix the problem that you currently have with SELECT * FROM table WHERE single-column IN multi-column-rowset, there is another problem further down: the WHERE Silhouette.Silhouette_Skirt=(SELECT ... part would not work either, because you cannot compare the value of a column against the result of a select statement. So, I do not know what you are trying to do there. Hopefully, once you normalize your schema and fix the first problem with your query, then the solution to the second problem will become obvious, or you can ask another question on stackoverflow.
P.S. did Mihai's answer work?
Related
I have two tables, meta_works(docID, tag, content) and meta_authors(authID, tag, content). Both tables have rows which include:
"xID", "authEng", "Author Name";
"xID", "authDate", "date range string";
"xID", "authLang", "language";
A distinct grouping in either table using the content from the tags of authEng + authDate + authLang will always have exactly one meta_authors.authID associated with it. One problem is I don't know how to do this distinct grouping across rows using multiple values of the same column.
I need to insert this distinct authID from meta_authors into meta_works for each distinct meta_works.docID as (docID=meta_works.docID, tag="authID", content=meta_authors.authID) so I can delete the other fields in meta_works and grab data from meta_authors instead.
The point of this attempted structure is to allow an arbitrary number of tags in either table associated with a work or author, and avoid duplicating the author data for each work, which helps with updating etc. The idea is to then expand this with a meta_sections table to allow meta tagging of work subsections as well.
I have tried to play around with inner/left/right joins and subqueries (never had to use these before due to limited mysql use) but cannot come up with anything that seems like even a start. The lack of examples is not for lack of trying/research, but with no experience this is proving difficult to approach. Perhaps it would be better to do this outside mysql and then update the table, but it intuitively feels well within the strengths of SQL so i feel it is worth the question.
Ok, for a moment, throw out of your mind "good database design". Let's say I have two tables, and they have some of the same columns.
item
-------
id
title
color
and
item_detail
-------
id
weight
color
In a good normal query, you'd choose the columns you want within the query, like so:
SELECT item.title, item_detail.color, item_detail.weight ...
But what if you are stuck with a query that was built with star/all:
SELECT * ...
In this case you would get two color columns pulled back in your results, one for each table. Is there a way in MySQL to chose one color column over the other, so only one shows up in the results, without a full rewrite of the statement? So that I could say that the table item_detail takes priority?
Probably not but I thought I'd ask.
Err. No there is not.
But define "without a full rewrite of the statement". As far as I can see you'd just need to rewrite the select * portion of the query.
If you cannot touch the statement at all, then you are free to ignore the column in your application (the order of the columns does not change between calls)... or you could create a view...
It's hard to know which constraints you are dealing with when you say "But what if you are stuck with a query".
I'm preparing for an exam in databases and SQL and I'm solving an exercise:
We have a database of 4 tables that represent a human resources company. The tables are:
applicant(a-id,a-name,a-city,years-of-study),
job(job-name,job-id),
qualified(a-id,job-id)
wish(a-id,job-id).
the table applicant represents the table of applicants obviously. And jobs is the table of available jobs. the table qualified shows what jobs a person is qualified for, and the table wish shows what jobs a person is interested in.
The question was to write a query that displays for each job-id, the number of applicants that are both qualified and interested to work in.
Here is the solution the teacher wrote:
Select q1.job_id
, count(q1.a_id)
from qualified as q1
, wish as w1
Where q1.a_id = w1.a_id
and q1.job_id = w1.job_id
Group by job_id;
That's all well and good, I'm not sure why we needed that "as q1" and "as w1", but i can see why it works.
And here is the solution I wrote:
SELECT job-id,COUNT(a-id) FROM job,qualified,wish WHERE (qualified.a-id=wish.a-id)
GROUP BY job-id
Why is my solution wrong? And also - From which table will it select the information? Suppose I write SELECT job-id FROM job,qualified,wish. From which table will it take the information? because job-id exists in all 3 of these tables.
You can only refer to tables mentioned in the FROM clause. If it's ambiguous (because more than one has a column of the same name) then you need to be explicit by qualifying the name. Usually the qualifier is an alias but it could also be the table name itself if an alias wasn't specified.
There's a concept of a "natural join" which joins tables on common column(s) between two tables. Not all systems support that notation but I think MySQL does. I believe these systems usually collapse the joined pairs into a single column.
select q1.job_id, count(q1.a_id) from qualified as q1, wish as w1
where q1.a_id = w1.a_id and q1.job_id = w1.job_id
group by job_id;
I don't think I've worked on any systems that would have accepted the query above because the grouping column would have been strictly unclear even though the intention really is not. So if it truly does work correctly on MySQL then my guess is that it recognizes the equivalence of the columns and cuts you some slack on the syntax.
By the way, your query appears to be incorrect because you only included a single column in a join that requires two columns. You also included a third table which means that your result will effectively do a cross join of every row in that table. The grouping is going to still going to reduce it to one row per job_id but the count is going to be multiplied by the number of rows in the job table. Perhaps you added that table thinking it would hurt to add it just in case you need it but that is not what it means at all.
Your query will list non-existing jobs in case the database has orphan records in applicant and qualified, and might also omit jobs that have no qualified and willing candidates.
I'm not exactly sure, because I have no idea if there's any database that will accept COUNT(a-id) when there's no information about the table from which to take this value.
edit: Interestingly it looks like both of these problems are shared by both of the solutions, but shawnt00 has a point: your solution makes a huge pointless cartesian of three tables: see it without the group by.
My current best guess for a working answer would therefore be http://sqlfiddle.com/#!9/09d0c/6
I noticed te other day I can joins in mysql just as easily by doing,
SELECT peeps, persons, friends FROM tablea JOIN tableb USING (id) WHERE id = ?
In stead of using,
SELECT a.peeps, a.persons, b.friends FROM tablea a JOIN tableb b USING (id) WHERE id = ?
It only works if there is no matching column names, why should I do the second rather than the first?
No, you don't need to, but in my humble opinion you really should. It's almost always better in my experience to be explicit with what you're trying to do.
Consider the feelings of the poor guy (or girl) who has to come behind you and try to figure out what you were trying to accomplish and in which tables each column resides. Explicitly stating the source of the column allows one to look at the query and glean that information without deep knowledge of the schema.
Query 1 will work (as long as there are no ambiguous column names).
Query 2 will
be clearer
be more maintainable (think of someone who doesn't know the database schema by heart)
survive the addition of an ambiguous column name to one of the tables
So, don't be lazy because of that pitiful few saved keystrokes.
It's not necessary if you have no duplicate column names. If you do, the query will fail.
Previous programmer left me with "beautiful" piece of code and he kind of forgot to apply something to it. There is a query which selects several items from several tables.
6 Items can be chosen. It means 6 tables can be chosen, however there can be more tables - even 20 of them. I need to get that list of tables from processed query.
Indeed, is it possible at all? Is there any command to get a list of tables the query used?
I'm unsure if I am understanding the question correctly but it may be worth 'explaining' the query and seeing which tables are being used as below.
EXPLAIN SELECT * FROM table1 JOIN
table2 USING (id)