The question is:
Which of the following methods for providing explicit names for the columns in a view work?
a. Include a column list
b. Provide column aliases in the view SELECT statement
c. Rename the columns when you select from the view
Answer:
a. Works: Include a column list
b. Works: Provide column aliases in the view SELECT statement
c. Does not work: Rename the columns when you select from the view
Regarding (c), what do they mean by "Rename the columns when you select from the view"?
I think the question in the certification guide is worded poorly. You can give explicit names to columns when you select from a view, and this works:
CREATE VIEW MyView AS SELECT a, b, c FROM MyTable;
SELECT a AS d, b AS e, c AS f FROM MyView;
The problem is not with giving aliases to columns explicitly. Here's the problem: if you rely on this instead of defining the view with distinct column names, and the view consists of a join such that the column names are ambiguous, you run into trouble:
CREATE VIEW MyView AS
SELECT m.a, m.b, m.c, o.a, o.b, o.c
FROM MyTable m JOIN OtherTable o;
This is not a valid view, because in the view definition, all column names must be distinct. For instance, you would get ambiguous results when you query the view:
SELECT a FROM MyView;
Does this select the first a column or the second a column?
So you must have a distinct set of column names in the view definition; it's not enough to make them distinct as you query the view.
This is the reason I think the certification guide question was poorly worded. It's not about renaming columns explicitly; it's about ensuring that the columns of the view have distinct names. This is a common reason for renaming columns, so that's probably why the person writing the question wrote it that way.
Either of the other techniques mentioned in the question can resolve the ambiguity:
CREATE VIEW MyView (a, b, c, d, e, f) AS
SELECT m.a, m.b, m.c, o.a, o.b, o.c
FROM MyTable m JOIN OtherTable o;
or
CREATE VIEW MyView AS
SELECT m.a, m.b, m.c, o.a AS d, o.b AS e, o.c AS f
FROM MyTable m JOIN OtherTable o;
Either way, you get the aliased columns:
SELECT * FROM MyView; -- returns result with columns a, b, c, d, e, f
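The two working methods can be tried out quickly. This is only a sketch, and it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL; the view names ViewWithList and ViewWithAliases and the sample rows are made up for illustration. Note that SQLite is more lenient than MySQL about duplicate column names in a view, so the sketch only shows the two techniques that work, not the error case.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; names and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE OtherTable (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO MyTable VALUES (1, 2, 3);
    INSERT INTO OtherTable VALUES (4, 5, 6);

    -- Method (a): give the view an explicit column list
    CREATE VIEW ViewWithList (a, b, c, d, e, f) AS
        SELECT m.a, m.b, m.c, o.a, o.b, o.c
        FROM MyTable m CROSS JOIN OtherTable o;

    -- Method (b): alias the columns inside the view's SELECT
    -- (every column is aliased here so the names do not depend on engine defaults)
    CREATE VIEW ViewWithAliases AS
        SELECT m.a AS a, m.b AS b, m.c AS c, o.a AS d, o.b AS e, o.c AS f
        FROM MyTable m CROSS JOIN OtherTable o;
""")


def column_names(sql):
    """Return the result column names that a query exposes."""
    cur = conn.execute(sql)
    return [col[0] for col in cur.description]


names_list = column_names("SELECT * FROM ViewWithList")
names_alias = column_names("SELECT * FROM ViewWithAliases")

# Column d is unambiguous in the view, so it can be selected by name.
row = conn.execute("SELECT d FROM ViewWithList").fetchone()

print(names_list, names_alias, row)
```

Either view exposes six distinct names, so a later SELECT d FROM ... has exactly one column to pick.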
By "rename when you select" they surely mean something like SELECT a AS b FROM theview etc. The reason it doesn't work for the given task of "providing explicit names for the columns" is that there need not be an explicit, unambiguous a in the view for you to "rename"... UNLESS you've already disambiguated by methods (a) or (b) [[in which case you may also "rename" this way, but that's pretty much a secondary issue!-)]].
Person is a table with columns PersonId, FirstName, LastName
Address is a table with columns PersonId, City, State
SELECT a.FirstName, a.LastName, b.City, b.State
FROM Person a, Address b
WHERE a.PersonId = b.PersonId;
I have two questions.
Please correct me if I'm wrong, but I'm guessing the purpose of the (a., b.) prefixes is to indicate which table a column belongs to, so that there is no ambiguity when two tables have a column with the same name?
Is there a name for this?
These are called table aliases, and indeed their purpose is to indicate which table each column comes from. This is mandatory to avoid ambiguity when a column by the same name exists in both tables - but it is also good general practice, so that people reading the query can understand it without knowing the underlying table structures.
Note that you don't necessarily need explicit aliases; you can also prefix the columns with the full table name if you like (like Person.FirstName): aliases just make things shorter to write.
You should use meaningful aliases so it is easier to remember them throughout the query (Person would be better aliased as p than as a).
Finally, you should be using explicit, modern joins (with the ON keyword) rather than old-school implicit joins, whose syntax has not been state of the art since ANSI SQL-92, decades ago.
Your query:
SELECT p.FirstName, p.LastName, a.City, a.State
FROM Person p
INNER JOIN Address a ON a.PersonId = p.PersonId;
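Here is a small sketch of both points: the implicit and explicit join return the same rows, and an unqualified column that exists in both tables is rejected as ambiguous. It assumes SQLite via Python's sqlite3 module rather than MySQL, and the sample people and addresses are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Address (PersonId INTEGER, City TEXT, State TEXT);
    INSERT INTO Person VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
    INSERT INTO Address VALUES (1, 'London', 'LDN');
""")

# Old-style implicit join: tables in FROM, join condition in WHERE.
implicit_rows = conn.execute("""
    SELECT p.FirstName, p.LastName, a.City, a.State
    FROM Person p, Address a
    WHERE a.PersonId = p.PersonId
""").fetchall()

# Explicit join: the join condition lives in the ON clause.
explicit_rows = conn.execute("""
    SELECT p.FirstName, p.LastName, a.City, a.State
    FROM Person p
    INNER JOIN Address a ON a.PersonId = p.PersonId
""").fetchall()

# PersonId exists in both tables, so it must be qualified (p.PersonId or a.PersonId).
try:
    conn.execute("""
        SELECT PersonId
        FROM Person p
        INNER JOIN Address a ON a.PersonId = p.PersonId
    """)
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

print(implicit_rows, explicit_rows, ambiguous_error)
```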
Generally it's called an identifier qualifier:
https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html
...a column name may be given a table-name qualifier, which itself may
be given a database-name qualifier. Examples of unqualified and
qualified column references in SELECT statements:
SELECT c1 FROM mytable
WHERE c2 > 100;
SELECT mytable.c1 FROM mytable
WHERE mytable.c2 > 100;
SELECT mydb.mytable.c1 FROM mydb.mytable
WHERE mydb.mytable.c2 > 100;
You can use a table alias as a qualifier either to make it shorter than the full table name, or because you are doing a self-join and you use aliases to make more than one reference to the same table name.
SELECT c1, c2, t1.c FROM db1.t AS t1 INNER JOIN db2.t AS t2
WHERE t2.c > 100;
2. Is there a name for this?
It's called qualifying an identifier. In SELECT city.Name FROM city; the table name city is the qualifier of Name, whereas in SELECT c.Name FROM city c; the alias c is the qualifier of Name.
1. Please correct me if I'm wrong, but I'm guessing that the purpose of the (a., b.) extensions are used to indicate a specific column of an SQL table so there exists no ambiguity between selecting a column given two tables may have the same column name?
This is true apart from the aliases. One can achieve this without aliases by referencing the exact names of the tables.
Aliases help in several ways.
They let you change the referenced table name in one place and in one place alone.
For example, changing sales to stock in
SELECT *
FROM sales s
WHERE s.col1 ... s.col2 ... s.col3 ...
-- WHERE sales.col1 ... sales.col2 ... sales.col3 ...
is relatively easy, because only the FROM clause needs editing.
They simplify the query by shortening the column references.
For example, the query below is harder to understand than one with simple aliases:
SELECT
product_color_configuration.Name
FROM product_color_configuration
LEFT JOIN product_channel_permission ON product_color_configuration.ProductId = product_channel_permission.ProductId ...
They remove the ambiguity when the same table is referenced several times in distinct roles:
SELECT
COALESCE(c1.Name, c2.Name) -- if Id exists prioritize
FROM table t
LEFT JOIN city c1 ON t.CityId = c1.Id
LEFT JOIN city c2 ON c2.Name REGEXP t.CityGuessedName
;
When I want to select all columns except foo and bar, what I normally do is explicitly list all the other columns in the select statement.
select a, b, c, d, ... from ...
But if the table has a dozen columns, this is a tedious process for such a simple goal. What I would like to do instead is something like the following pseudo statement:
select * except(foo, bar) from ...
I would also like to know whether there is a function to filter duplicate rows out of a multi-column result, that is, rows that have the same content in all corresponding columns.
------------------------
A | B | C
------------------------    ====>    ------------------------
A | B | C                            A | B | C
------------------------             ------------------------
You can query the INFORMATION_SCHEMA database and get the list of columns (except those two) for that table, e.g.:
SELECT GROUP_CONCAT(COLUMN_NAME ORDER BY ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '<your_table>' AND TABLE_SCHEMA = '<database>'
AND COLUMN_NAME NOT IN ('foo', 'bar');
Once you get the list of columns, you can use that in your select query.
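The same idea can be scripted end to end. This sketch assumes SQLite through Python's sqlite3 module, which has no INFORMATION_SCHEMA; PRAGMA table_info plays that role here. The table name items and its columns are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, so PRAGMA table_info replaces
# INFORMATION_SCHEMA.COLUMNS. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (a INTEGER, foo TEXT, b INTEGER, bar TEXT, c INTEGER)")
conn.execute("INSERT INTO items VALUES (1, 'x', 2, 'y', 3)")

excluded = {"foo", "bar"}

# PRAGMA table_info returns one row per column; index 1 holds the column name.
all_columns = [info[1] for info in conn.execute("PRAGMA table_info(items)")]
kept = [name for name in all_columns if name not in excluded]

# Quote each identifier, then build the explicit column list.
column_list = ", ".join('"{}"'.format(name) for name in kept)
query = "SELECT {} FROM items".format(column_list)

rows = conn.execute(query).fetchall()
print(query)
print(rows)
```

The column names come from the catalog, not from user input, which is why formatting them into the statement is acceptable in this sketch.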
You can create a view based on this table with all columns except these two, and then use this view every time with
select * from view
A simple GROUP BY on all columns will remove such duplicates. There are other options as well: DISTINCT and ROW_NUMBER.
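A quick sketch of the two simplest options, assuming SQLite via Python's sqlite3 module in place of MySQL; the table t and its rows are made up to mirror the A | B | C picture in the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table t and its rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (x TEXT, y TEXT, z TEXT);
    INSERT INTO t VALUES ('A', 'B', 'C');
    INSERT INTO t VALUES ('A', 'B', 'C');  -- exact duplicate of the first row
    INSERT INTO t VALUES ('A', 'B', 'D');
""")

# Option 1: DISTINCT collapses rows that match in every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT x, y, z FROM t ORDER BY x, y, z"
).fetchall()

# Option 2: GROUP BY on all columns has the same effect.
grouped_rows = conn.execute(
    "SELECT x, y, z FROM t GROUP BY x, y, z ORDER BY x, y, z"
).fetchall()

print(distinct_rows)
print(grouped_rows)
```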
select * except(foo, bar) from
This is a frequently requested feature on SO. However, it has not made it to the SQL Standard and I don't know of any SQL products that support it. I guess when the product managers ask their developers, MVPs, usergroups, etc to measure enthusiasm for this prospective feature, they mostly hear, "SELECT * FROM is considered dangerous, we need to protect new users who don't know what they are doing, etc."
You may find it useful to use NATURAL JOIN rather than INNER JOIN etc which removes what would be duplicated columns from the resulting table expression e.g.
SELECT *
FROM Table1 t1
INNER JOIN Table2 t2
ON t1.foo = t2.foo
AND t1.bar = t2.bar;
will result in two columns named foo and two named bar (and possibly other repeated names), probably de-duplicated in some way e.g. by suffixing the range variable names t1 and t2 that INNER JOIN forced you into using.
Whereas:
SELECT *
FROM Table1 NATURAL JOIN Table2;
doesn't require the use of range variables (a good thing) because there will only be one column named foo and one named bar in the result.
And to remove duplicated rows as well as columns, change the implied SELECT ALL * into the explicit SELECT DISTINCT * e.g.
SELECT DISTINCT *
FROM Table1 NATURAL JOIN Table2;
Doing this may reduce your need for the SELECT ALL BUT { these columns } feature you desire.
Of course, if you do that you will be told, "NATURAL JOIN is considered dangerous, we need to protect you from yourself in case you don't know what you are doing, etc." :)
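To see the difference in the result columns, here is a sketch that assumes SQLite through Python's sqlite3 module instead of MySQL. The extra columns x and y and the sample rows are invented; only foo and bar come from the discussion above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; columns x, y and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (foo INTEGER, bar INTEGER, x TEXT);
    CREATE TABLE Table2 (foo INTEGER, bar INTEGER, y TEXT);
    INSERT INTO Table1 VALUES (1, 2, 'left');
    INSERT INTO Table1 VALUES (1, 2, 'left');  -- duplicate row on purpose
    INSERT INTO Table2 VALUES (1, 2, 'right');
""")

# INNER JOIN with SELECT * carries the join columns from both sides.
inner_cur = conn.execute("""
    SELECT *
    FROM Table1 t1
    INNER JOIN Table2 t2
      ON t1.foo = t2.foo
     AND t1.bar = t2.bar
""")
inner_columns = [col[0] for col in inner_cur.description]

# NATURAL JOIN keeps a single copy of each common column.
natural_cur = conn.execute("SELECT * FROM Table1 NATURAL JOIN Table2")
natural_columns = [col[0] for col in natural_cur.description]
natural_rows = natural_cur.fetchall()

# DISTINCT additionally removes the duplicated rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT * FROM Table1 NATURAL JOIN Table2"
).fetchall()

print(len(inner_columns), natural_columns, len(natural_rows), distinct_rows)
```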
Basically, there is an attribute table and translation table - many translations for one attribute.
I need to select id and value from translation for each attribute in a specified language, even if there is no translation record in that language. Either I am missing some join technique or a join (without involving the language table) does not work here, since the following does not return attributes that have no translation in the specified language.
select a.attribute, at.id, at.translation
from attribute a left join attributeTranslation at on a.id=at.attribute
where at.language=1;
So I am using subqueries like this. The problem is that I make two subqueries to the same table with the same parameters (which feels like a performance drain unless MySQL groups them, which I doubt since it makes you write many similar subqueries).
select attribute,
(select id from attributeTranslation where attribute=a.id and language=1),
(select translation from attributeTranslation where attribute=a.id and language=1)
from attribute a;
I would like to be able to get id and translation from one query, so I concat columns and get the id from string later, which is at least making single subquery but still not looking right.
select attribute,
(select concat(id,';',title)
from offerAttribute_language
where offerAttribute=a.id and _language=1
)
from offerAttribute a
So the question part.
Is there a way to get multiple columns from a single subquery or should I use two subqueries (MySQL is smart enough to group them?) or is joining the following way to go:
[[attribute to language] to translation] (joining 3 tables seems like a worse performance than subquery).
Yes, you can do this. The knack you need is the concept that there are two ways of getting tables out of the table server. One way is:
FROM TABLE A
The other way is
FROM (SELECT col as name1, col2 as name2 FROM ...) B
Notice that the select clause and the parentheses around it are a table, a virtual table.
So, using your second code example (I am guessing at the columns you are hoping to retrieve here):
SELECT a.attribute, b.id, b.trans, b.lang
FROM attribute a
-- LEFT JOIN keeps attributes that have no translation in language 1
LEFT JOIN (
SELECT at.id AS id, at.translation AS trans, at.language AS lang, at.attribute
FROM attributeTranslation at
) b ON (a.id = b.attribute AND b.lang = 1)
Notice that your real table attribute is the first table in this join, and that this virtual table I've called b is the second table.
This technique comes in especially handy when the virtual table is a summary table of some kind. e.g.
SELECT a.attribute, b.id, b.trans, b.lang, c.langcount
FROM attribute a
LEFT JOIN (
SELECT at.id AS id, at.translation AS trans, at.language AS lang, at.attribute
FROM attributeTranslation at
) b ON (a.id = b.attribute AND b.lang = 1)
LEFT JOIN (
SELECT count(*) AS langcount, at.attribute
FROM attributeTranslation at
GROUP BY at.attribute
) c ON (a.id = c.attribute)
See how that goes? You've generated a virtual table c containing two columns, joined it to the other two, used one of the columns for the ON clause, and returned the other as a column in your result set.
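A runnable sketch of the virtual-table technique applied to the original requirement (every attribute is returned, with NULLs where no translation exists in the chosen language). It assumes SQLite via Python's sqlite3 module instead of MySQL, the column layout is guessed from the question, the alias tr is used in place of at, and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema is guessed from the question
# and the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, attribute TEXT);
    CREATE TABLE attributeTranslation (
        id INTEGER PRIMARY KEY, attribute INTEGER, language INTEGER, translation TEXT
    );
    INSERT INTO attribute VALUES (1, 'color'), (2, 'size');
    INSERT INTO attributeTranslation VALUES
        (10, 1, 1, 'Farbe'), (11, 1, 2, 'couleur'), (12, 2, 2, 'taille');
""")

# Language filter in the ON clause: attributes without a match are kept.
on_rows = conn.execute("""
    SELECT a.attribute, b.id, b.trans
    FROM attribute a
    LEFT JOIN (
        SELECT tr.id AS id, tr.translation AS trans,
               tr.language AS lang, tr.attribute AS attribute
        FROM attributeTranslation tr
    ) b ON a.id = b.attribute AND b.lang = 1
    ORDER BY a.id
""").fetchall()

# Language filter in WHERE: rows with a NULL language are discarded afterwards,
# which is why the first query in the question loses untranslated attributes.
where_rows = conn.execute("""
    SELECT a.attribute, tr.id, tr.translation
    FROM attribute a
    LEFT JOIN attributeTranslation tr ON a.id = tr.attribute
    WHERE tr.language = 1
    ORDER BY a.id
""").fetchall()

print(on_rows)
print(where_rows)
```

Both columns (id and translation) come out of one joined virtual table, so no second subquery and no string concatenation is needed.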
Let's say I have a MySQL table called FISH with fields A, B and C.
I run SELECT * FROM FISH. This gets me a view with all fields. So, if A was a key in the original table, is it also a key in the view? Meaning, if I have a table FISH2, and I ran
SELECT * FROM (SELECT * FROM FISH) D, (SELECT * FROM FISH2) E WHERE D.A = E.A
Will the relevant fields still be keys?
Now, let's take this 1 step further. If I run
SELECT * FROM (SELECT CONCAT(A,B) AS DUCK, C FROM FISH) D, (SELECT CONCAT(A,B) AS DUCK2, C FROM FISH2) E WHERE D.DUCK = E.DUCK2
If A and B were keys in the original tables, will their concatenation also be a key?
Thanks :)
If A is a key in fish, any projection of fish alone that keeps A will produce a result set in which A is still unique.
A join between table fish and any table with 1:1 relation (such as fish_type) will produce a result set where A is unique.
A join with another table that has a 1:M or M:M relation from fish (such as fish_baits) will NOT produce a result where A is unique, unless you provide a filter predicate on the "other" side (such as bait='Dynamite').
SELECT * FROM (SELECT * FROM FISH) D, (SELECT * FROM FISH2) E WHERE D.A = E.A
...is logically equivalent to the following statement, and most databases (including MySQL) will perform the transformation:
select *
from fish
join fish2 on(fish.a = fish2.a)
Whether A is still unique in the resultset depends on the key of fish2 and their relation (see above).
Concatenation does not preserve uniqueness. Consider the following case:
concat("10", "10") => "1010"
concat("101", "0") => "1010"
Therefore, your final query...
SELECT *
FROM (SELECT CONCAT(A,B) AS DUCK, C FROM FISH) D
,(SELECT CONCAT(A,B) AS DUCK2, C FROM FISH2) E
WHERE D.DUCK = E.DUCK2
...won't (necessarily) produce the same result as
select *
from fish
join fish2 on(
fish.a = fish2.a
and fish.b = fish2.b
)
I wrote necessarily because the collisions depend on the actual values. I hunted down a bug some time ago where the root cause was exactly this. The code had worked for several years before the bug manifested itself.
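The collision is easy to reproduce. This sketch assumes SQLite via Python's sqlite3 module, where the || operator plays the role of MySQL's CONCAT; the two rows are chosen to match the "10" + "10" versus "101" + "0" case above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; || is used instead of CONCAT().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FISH (A TEXT, B TEXT, C TEXT);
    CREATE TABLE FISH2 (A TEXT, B TEXT, C TEXT);
    INSERT INTO FISH VALUES ('10', '10', 'left');
    INSERT INTO FISH2 VALUES ('101', '0', 'right');
""")

# Join on the concatenated value: '10' || '10' and '101' || '0' are both '1010'.
concat_rows = conn.execute("""
    SELECT D.DUCK, E.DUCK2
    FROM (SELECT A || B AS DUCK, C FROM FISH) D
    JOIN (SELECT A || B AS DUCK2, C FROM FISH2) E
      ON D.DUCK = E.DUCK2
""").fetchall()

# Join on the column pair: (A, B) differs, so nothing matches.
pair_rows = conn.execute("""
    SELECT *
    FROM FISH
    JOIN FISH2
      ON FISH.A = FISH2.A
     AND FISH.B = FISH2.B
""").fetchall()

print(concat_rows)  # one spurious match
print(pair_rows)    # no match
```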
If by "key" you mean "unique", yes, tuples of a cartesian product over unique values will be unique.
(One can prove it by reductio ad absurdum.)
For step 1, think of a view as a subquery containing everything in the AS clause when CREATE VIEW was executed.
For example, if view v is created as SELECT a, b, c FROM t, then when you execute...
SELECT * FROM v WHERE a = some_value
...it's conceptually treated as...
SELECT * FROM (SELECT a, b, c FROM t) AS v WHERE a = some_value
Any database with a decent optimizer will notice that column a is passed straight into the results and that it can take advantage of the indexing in t (if there is any) by moving the condition into the subquery:
SELECT * FROM (SELECT a, b, c FROM t WHERE a = some_value) AS v
This all happens behind the scenes and is not an optimization you need to do yourself. Obviously, it can't do that for every condition in the WHERE clause, but understanding where you can is part of the art of writing a good optimizer.
For step 2, the concatenated keys will be part of intermediate results, and whether or not the database decides they need indexing is an implementation detail. Also note fche's comment about duplication.
If your database has a query plan explainer, running it and learning to interpret the results will give you a lot of insight about what makes your queries run fast and what slows them down.
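As a rough illustration of using a plan explainer on a view, the sketch below asks SQLite (via Python's sqlite3 module) for its plan with EXPLAIN QUERY PLAN; in MySQL the corresponding statement is EXPLAIN. The index name idx_t_a is made up, and the exact wording of the plan text varies between SQLite versions, so the code prints it without relying on a specific format.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; idx_t_a is a hypothetical index name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER);
    CREATE INDEX idx_t_a ON t (a);
    CREATE VIEW v AS SELECT a, b, c FROM t;
""")

# Ask the engine how it would run a filtered query against the view.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM v WHERE a = 1").fetchall()

# The human-readable description is the last field of each plan row.
plan_details = [row[-1] for row in plan]

for detail in plan_details:
    # If the condition was pushed down to t, the text typically names the index.
    print(detail)
```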
Is there a way, using MySQL 5.0, to define a column in a table that is to be calculated whenever a select is executed on that particular row? For example say I have two tables, A and B:
A:
ID
COMPUTED_FIELD = SUM(SELECT B.SOME_VALUE FROM B WHERE B.A_ID = ID)
B:
ID
A_ID
SOME_VALUE
In the example, when I run a select on A for a particular ID, I want it to return the sum of all values in Table B for that particular value of A.ID. I know how to do this using multiple separate queries and doing a group by A_ID, but I'm trying to streamline the process a little.
Yes. You cannot do that inside a table, but you can do it in a view.
CREATE VIEW A
AS
SELECT SUM(B.SOME_VALUE) AS COMPUTED_FIELD
FROM B
WHERE B.A_ID = 'id';
Obviously id needs to be whatever you are searching for.
You don't need table A in this case.
In MySQL 5.0, tables cannot contain calculated values. Try using views. The manual entry has full details. Your end result will come out looking something like: CREATE VIEW A AS SELECT B.A_ID, SUM(B.SOME_VALUE) AS COMPUTED_FIELD FROM B GROUP BY B.A_ID; (Not tested)
That said, I'm not sure how important it is for you to have an independent, unique ID for table A -- this isn't possible unless you add another table C to hold the foreign keys referenced by B.A_ID, and use table C as a reference in creating your view.
As txwikinger suggests, the best way to do this is to set up A as a view, not a table. Views are, for all intents and purposes, a streamlined, reusable query. They're generally used a) when a common query has a computed column, or b) to abstract away complex joins that are often used.
To expand on the previous answer, in order to be able to query A for any ID, try this view:
CREATE VIEW A
AS
SELECT B.A_ID AS ID, SUM(B.SOME_VALUE) AS COMPUTED_FIELD
FROM B
GROUP BY B.A_ID;
Then you can select from it normally, for example:
SELECT A.ID, A.COMPUTED_FIELD
FROM A
WHERE A.ID IN (10, 30);
or
SELECT A.ID, A.COMPUTED_FIELD
FROM A
WHERE COMPUTED_FIELD < 5;
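To check that the view really recomputes the sum at query time, here is a sketch that assumes SQLite via Python's sqlite3 module rather than MySQL 5.0; the rows in B are invented sample data.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 5.0; the rows in B are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (ID INTEGER PRIMARY KEY, A_ID INTEGER, SOME_VALUE INTEGER);
    INSERT INTO B VALUES (1, 10, 2), (2, 10, 3), (3, 20, 1), (4, 30, 7);

    CREATE VIEW A AS
        SELECT B.A_ID AS ID, SUM(B.SOME_VALUE) AS COMPUTED_FIELD
        FROM B
        GROUP BY B.A_ID;
""")

# Filter on the grouping key.
by_id = conn.execute(
    "SELECT ID, COMPUTED_FIELD FROM A WHERE ID IN (10, 30) ORDER BY ID"
).fetchall()

# Filter on the computed column.
small = conn.execute(
    "SELECT ID, COMPUTED_FIELD FROM A WHERE COMPUTED_FIELD < 5 ORDER BY ID"
).fetchall()

# The view is evaluated on each SELECT, so new rows in B show up immediately.
conn.execute("INSERT INTO B VALUES (5, 20, 10)")
after_insert = conn.execute(
    "SELECT ID, COMPUTED_FIELD FROM A WHERE ID = 20"
).fetchall()

print(by_id, small, after_insert)
```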