I'm trying to perform a join operation on two tables linked by item IDs. However, the problem with them is that they've got columns with the same name as follows:
items
(ID, **Quantity**, etc. /*[nothing in etc. is shared by status' columns]*/)
status
(ID, **Quantity**, etc. /*[nothing in etc. is shared by items' columns]*/)
I want to get all records from these tables and join them, but I don't know what the SQL query would look like. I know it'd be something like:
SELECT *
FROM items
LEFT OUTER JOIN status
ON items.ID = status.ID
and I know I need aliases for the two quantity columns (which I know how to do), but where does the latter part of the query fit in?
In general, I recommend avoiding SELECT * queries. Just select the specific columns you need, and if there are duplicate column names you can easily assign aliases for them.
SELECT i.col1, i.col2, i.quantity AS item_quantity, s.col3, s.col4, s.quantity AS status_quantity
FROM items AS i
JOIN status AS s ON i.ID = s.ID
But if you really need to select all columns, you can use the solution in Marc B's answer.
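To make the aliasing concrete, here is a minimal sketch that runs the aliased query against throwaway data. Everything beyond the ID and Quantity columns is an assumption for illustration: the name and state columns and the sample rows are hypothetical, SQLite (through Python's sqlite3 module) stands in for MySQL, and the LEFT OUTER JOIN from the question is used rather than a plain JOIN.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (ID INTEGER PRIMARY KEY, Quantity INTEGER, name TEXT);
    CREATE TABLE status (ID INTEGER PRIMARY KEY, Quantity INTEGER, state TEXT);
    INSERT INTO items  VALUES (1, 10, 'bolt'), (2, 5, 'nut');
    INSERT INTO status VALUES (1, 7, 'shipped');
""")

# Each Quantity column gets its own alias, so neither hides the other.
cur = conn.execute("""
    SELECT i.ID,
           i.name,
           i.Quantity AS item_quantity,
           s.state,
           s.Quantity AS status_quantity
    FROM items AS i
    LEFT OUTER JOIN status AS s ON i.ID = s.ID
    ORDER BY i.ID
""")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

Item 2 has no status row, so its state and status_quantity come back as NULL (None in Python), which is the LEFT OUTER JOIN behaviour the question asked for.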
I need to write a query to join 3 tables.
My tables are:
ucommerce_customer
ucommerce_order
ucommerce_order_line
All 3 tables have a column called order_id.
The table ucommerce_order has a column called order_status.
When the order_status is set to "open" I want to display the order details.
ResultSet myRs = myStmt.executeQuery
("SELECT * FROM ucommerce_customer
INNER JOIN ucommerce_order
INNER JOIN ucommerce_order_line
WHERE ucommerce_order.order_status = 'open'");
My query ignores the order status and displays all orders, open and closed.
Also I have several products so ucommerce_order_line has several entries for the same order_id, my query displays duplicate entries and it duplicates the entire list as well.
How can I write a query that will show only open orders without duplicating everything?
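In MySQL, the ON/USING clause is optional for an INNER JOIN. That is unfortunate, because it makes mistakes like this one easy: without a join condition, every row of one table is paired with every row of the others. Your question only mentions one column, so perhaps that is all that is needed for the join: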
SELECT *
FROM ucommerce_customer INNER JOIN
ucommerce_order
USING (order_id) INNER JOIN
ucommerce_order_line
USING (order_id)
WHERE ucommerce_order.order_status = 'open';
I would be surprised if the customer table really had a column called order_id (seems like a bad idea in most situations), so the first USING clause might need to use a customer ID column (something like CustomerId) instead.
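As a rough illustration of why the missing join condition matters, the sketch below counts the rows produced with and without USING. The table contents, the name and product columns, and the use of SQLite via Python's sqlite3 module instead of MySQL are all assumptions made for the example; only the table names, order_id, and order_status come from the question.

```python
import sqlite3

# Hypothetical data: as in the question, all three tables carry order_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ucommerce_customer   (name TEXT, order_id INTEGER);
    CREATE TABLE ucommerce_order      (order_id INTEGER PRIMARY KEY, order_status TEXT);
    CREATE TABLE ucommerce_order_line (order_id INTEGER, product TEXT);
    INSERT INTO ucommerce_customer   VALUES ('Ann', 1), ('Bob', 2);
    INSERT INTO ucommerce_order      VALUES (1, 'open'), (2, 'closed');
    INSERT INTO ucommerce_order_line VALUES (1, 'pen'), (1, 'ink'), (2, 'cup');
""")

# No join condition: every customer is paired with every open order
# and every order line (2 customers x 1 open order x 3 lines).
cross_count = conn.execute("""
    SELECT COUNT(*)
    FROM ucommerce_customer
    INNER JOIN ucommerce_order
    INNER JOIN ucommerce_order_line
    WHERE ucommerce_order.order_status = 'open'
""").fetchone()[0]

# With USING (order_id), rows are only combined when the ids match.
open_rows = conn.execute("""
    SELECT c.name, o.order_status, l.product
    FROM ucommerce_customer AS c
    INNER JOIN ucommerce_order AS o USING (order_id)
    INNER JOIN ucommerce_order_line AS l USING (order_id)
    WHERE o.order_status = 'open'
    ORDER BY l.product
""").fetchall()

print(cross_count)
print(open_rows)
```

The joined result still has one row per order line, which is expected when an order contains several products; it just no longer repeats every line for every customer.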
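I would recommend using a natural join instead. Maybe that's where the errors are coming from. Keep in mind that a NATURAL JOIN matches rows on every column name the tables share, so it only works if the key columns are the only shared names.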
The duplicates can be removed by running SELECT DISTINCT * ...
I have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id", (26 orders)
I need to display "Cust_id", "first_name", "last_name" and a count of "order_id" grouped by cust_id, that is, the total number of orders placed by each customer.
I am running below query, however, it is counting all the 26 orders and applying that 26 orders to each of the customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
Could you please advise what is wrong in the query?
Your issue here is that you have not told the database how these two tables are 'connected', or what they should be connected by.
A join condition does exactly that: it effectively allows you to 'join' two tables together and run a query across them.
So you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B -- this is using a left join, but an inner join may also be appropriate
ON (A.cust_id = B.cust_id) -- what links them together
GROUP BY A.cust_id; -- the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
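To see the difference on a small scale, here is a sketch that runs both the original comma join and the LEFT JOIN version. The three customers and three orders are made-up sample rows, and SQLite (via Python's sqlite3 module) is assumed as a stand-in for MySQL.

```python
import sqlite3

# Hypothetical sample data: Cy has placed no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer   (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO customer   VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
    INSERT INTO cust_order VALUES (101, 1), (102, 1), (103, 2);
""")

# No join condition: each customer is paired with all three orders.
wrong = conn.execute("""
    SELECT cus.cust_id, COUNT(order_id)
    FROM cust_order, customer cus
    GROUP BY cus.cust_id
    ORDER BY cus.cust_id
""").fetchall()

# LEFT JOIN on cust_id: only each customer's own orders are counted,
# and customers without orders are kept with a count of 0.
right = conn.execute("""
    SELECT A.cust_id, A.first_name, COUNT(B.order_id)
    FROM customer A
    LEFT JOIN cust_order B ON A.cust_id = B.cust_id
    GROUP BY A.cust_id, A.first_name
    ORDER BY A.cust_id
""").fetchall()

print(wrong)
print(right)
```

COUNT(B.order_id) skips the NULL produced for a customer with no match, which is why Cy shows 0 rather than 1. An INNER JOIN would drop Cy from the result entirely.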
Execution Time
While the accepted answer in this case may work well in small datasets, it may however become 'heavy' in larger databases. This is because it was not actually designed for this type of operation.
This is exactly the purpose joins were introduced for.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
An INNER JOIN will return data from both tables where the keys in each table match.
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are much faster than other ways of querying two or more tables (the difference is far more visible on larger datasets).
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cus.cust_id;
Maybe a left join will help you:
SELECT COUNT(co.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id = cus.cust_id)
GROUP BY cus.cust_id;
I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table, and each note has an ID that corresponds with the student's unique ID. The above statement returns the list of students but duplicates every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX NoteId for each student.
Something like this should do it (not tested; next time I'd recommend pasting a link to http://sqlfiddle.com/ with your table structure and some sample data):
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
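For what it's worth, here is a small runnable sketch of the MAX-per-student join described in this answer, using the table names from the question. The Note column, the sample rows, and the use of SQLite through Python's sqlite3 module in place of MySQL are assumptions. The consultant is passed as a bound parameter rather than interpolated as '$Consultant'.

```python
import sqlite3

# Hypothetical sample data; Dunn has no notes at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students2014 (Student INTEGER PRIMARY KEY, LastName TEXT, Consultant TEXT);
    CREATE TABLE notes2014    (NoteId INTEGER PRIMARY KEY, NoteStudent INTEGER, Note TEXT);
    INSERT INTO students2014 VALUES
        (1, 'Adams', 'Kim'), (2, 'Baker', 'Kim'), (3, 'Clark', 'Lee'), (4, 'Dunn', 'Kim');
    INSERT INTO notes2014 VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only');
""")

# aux holds the highest NoteId per student; joining notes2014 back on
# that id fetches the full row of the latest note only.
latest = conn.execute("""
    SELECT s.Student, s.LastName, n.NoteId, n.Note
    FROM students2014 s
    LEFT JOIN (
        SELECT NoteStudent, MAX(NoteId) AS max_id
        FROM notes2014
        GROUP BY NoteStudent
    ) aux ON aux.NoteStudent = s.Student
    LEFT JOIN notes2014 n ON n.NoteId = aux.max_id
    WHERE s.Consultant = ?
    ORDER BY s.LastName
""", ("Kim",)).fetchall()

for row in latest:
    print(row)
```

Each student appears once. Because both joins are LEFT joins, a student with no notes is still listed, with NULLs in the note columns.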
Group the query by the student ID and select the MAX of the NoteId.
Try:
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName
I have a query which goes like this:
SELECT insanlyBigTable.description_short,
insanlyBigTable.id AS insanlyBigTable,
insanlyBigTable.type AS insanlyBigTableLol,
catalogpartner.id AS catalogpartner_id
FROM insanlyBigTable
INNER JOIN smallerTable ON smallerTable.id = insanlyBigTable.catalog_id
INNER JOIN smallerTable1 ON smallerTable1.catalog_id = smallerTable.id
AND smallerTable1.buyer_id = 'xxx'
WHERE smallerTable1.cont = 'Y' AND insanlyBigTable.type IN ('111','222','33')
GROUP BY smallerTable.id;
Now, when I run the query first time it copies the giant table into a temp table... I want to know how I can prevent that? I am considering a nested query, or even to reverse the join (not sure the effect would be to run faster), but that is well, not nice. Any other suggestions?
To figure out how to optimize your query, we first have to boil down exactly what it is selecting so that we can preserve that information while we change things around.
What your query does
So, it looks like we need the following
The GROUP BY clause limits the results to at most one row per catalog_id
smallerTable1.cont = 'Y', insanlyBigTable.type IN ('111','222','33'), and buyer_id = 'xxx' appear to be the filters on the query.
And we want data from insanlyBigTable and ... catalogpartner? I would guess that catalogpartner is smallerTable1, due to the id of smallerTable being linked to the catalog_id of the other tables.
I'm not sure what the purpose of putting the buyer_id filter in the ON clause was, but unless you tell me differently, I'll assume its placement is unimportant (for an INNER JOIN, a filter in the ON clause behaves the same as one in the WHERE clause).
The point of the query
I am unsure about the intent of the query, based on that GROUP BY clause. You will obtain just one row per catalog_id in the insanlyBigTable, but you don't appear to care which row it is. Indeed, the fact that you can run this query at all is due to a special non-standard feature in MySQL that lets you SELECT columns that do not appear in the GROUP BY clause... however, you don't get to choose WHICH row each of those values comes from. This means you could have information from 4 different rows for each of your selected items.
My best guess, based on column names, is that you are trying to bring back a list of items that are in the same catalog as something that was purchased by a given buyer, but without any more than one item per catalog. In addition, you want something to connect back to the purchased item in that catalog, via the catalogpartner table's id.
So, something probably akin to amazon's "You may like these items because you purchased these other items" feature.
The new query
We want 1 row per insanlyBigTable.catalog_id, based on which catalog_id exists in smallerTable1, after filtering.
SELECT
    ibt.description_short,
    ibt.id AS insanlyBigTable,
    ibt.type AS insanlyBigTableLol,
    (
        -- one matching partner row for this catalog
        SELECT st.id FROM smallerTable1 st
        WHERE st.buyer_id = 'xxx'
        AND st.cont = 'Y'
        AND st.catalog_id = ibt.catalog_id
        LIMIT 1
    ) AS catalogpartner_id
FROM insanlyBigTable ibt
WHERE ibt.id IN (
    SELECT (
        -- one big-table row per filtered catalog
        SELECT ibt.id AS ibt_id
        FROM insanlyBigTable ibt
        WHERE ibt.catalog_id = sti.catalog_id
        LIMIT 1
    ) AS ibt_id
    FROM (
        -- catalogs that pass all the filters
        SELECT DISTINCT(catalog_id) FROM smallerTable1 st
        WHERE st.buyer_id = 'xxx'
        AND st.cont = 'Y'
        AND EXISTS (
            SELECT * FROM insanlyBigTable ibt
            WHERE ibt.type IN ('111','222','33')
            AND ibt.catalog_id = st.catalog_id
        )
    ) AS sti
);
This query should generate the same result as your original query, but it breaks things down into smaller queries to avoid the use (and abuse) of the GROUP BY clause on the insanlyBigTable.
Give it a try and let me know if you run into problems.
I'm doing several MySQL joins to get template variables (i.e. custom fields) and their values (in MODX Evo but it's irrelevant - this is a general MySQL query).
Ideally, I'm looking to create 2 temporary columns so that I can use ORDER BY in the query, or something to this effect. I'd like to populate the values for 'event_date' and 'event_featured' for their corresponding ids in these new columns, so that I could then sort the results by these columns.
On a very related note, I would like to limit the results to 20 unique ids, not to 20 rows as would happen if I added LIMIT, which would simply crop the result below. Can this be accomplished at the same time?
Anybody know how / if these are possible? Many thanks in advance.
Code and image of the results below:
SELECT DISTINCT
content.id, content.pagetitle, content.template , content.published,
templates.templatename,
tv_props.name,
tv_values.value
FROM `modx_site_content` AS `content`
LEFT JOIN `modx_site_templates` AS `templates` ON content.template=templates.id
LEFT JOIN `modx_site_tmplvar_templates` AS `template_tvs` ON templates.id=template_tvs.templateid
LEFT JOIN `modx_site_tmplvars` AS `tv_props` ON template_tvs.tmplvarid=tv_props.id
LEFT JOIN `modx_site_tmplvar_contentvalues` AS `tv_values` ON template_tvs.tmplvarid=tv_values.tmplvarid
WHERE templates.id=89
AND (
tv_props.name='event_featured'
OR tv_props.name='event_link_through'
OR tv_props.name='event_title'
OR tv_props.name='event_date'
OR tv_props.name='event_date_text'
OR tv_props.name='event_short_description'
OR tv_props.name='event_list_image'
);
You're going to need a couple of virtual tables, also known as subqueries, to retrieve these two properties of events from your name/value table. The generic name for this kind of query is a "pivot," for your information.
The mental knack is to think of the subquery as a virtual table which you can use in a surrounding query. The subquery for event_date looks like this, I believe.
SELECT content.id AS id,
tv_values.value AS event_date
FROM modx_site_content AS content
LEFT JOIN modx_site_templates AS templates
ON content.template=templates.id
LEFT JOIN modx_site_tmplvar_templates AS template_tvs
ON templates.id=template_tvs.templateid
LEFT JOIN modx_site_tmplvars AS tv_props
ON template_tvs.tmplvarid=tv_props.id
LEFT JOIN modx_site_tmplvar_contentvalues AS tv_values
ON template_tvs.tmplvarid=tv_values.tmplvarid
WHERE tv_props.name = 'event_date'
This little query produces a resultset that's a table relating content id to event date. I honestly don't understand your schema well enough to know if there's just one event date for each content id, so you might need to adjust this query to SELECT more columns. As you debug this, you should try out the subquery and make sure it's giving the results you hope for.
Then, when you're sure the subquery is OK, you join that subquery into your overall query, generically like so.
SELECT DISTINCT
content.id, event_date.event_date, templates.column,
table.column, table.column, etc, etc
FROM modx_site_content AS content
LEFT JOIN table ON condition
LEFT JOIN (
SELECT content.id AS id,
tv_values.value AS event_date
FROM modx_site_content AS content
LEFT JOIN modx_site_templates AS templates
ON content.template=templates.id
LEFT JOIN modx_site_tmplvar_templates AS template_tvs
ON templates.id=template_tvs.templateid
LEFT JOIN modx_site_tmplvars AS tv_props
ON template_tvs.tmplvarid=tv_props.id
LEFT JOIN modx_site_tmplvar_contentvalues AS tv_values
ON template_tvs.tmplvarid=tv_values.tmplvarid
WHERE tv_props.name = 'event_date'
) AS event_date ON event_date.id = content.id
LEFT JOIN etc, etc, etc.
WHERE etc etc etc
Do you see how that goes? You can use tablename AS table or (some query) AS table interchangeably. You can also define a VIEW in your schema that provides the same data, and name it in your query. That's a handy way to make your queries less hairy.
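Here is a stripped-down sketch of the same pivot idea, with one subquery per property joined back to the content rows. It deliberately does not use the real MODX tables: the two-table schema (content, tv_values with contentid/name/value), the sample rows, and SQLite via Python's sqlite3 module in place of MySQL are all hypothetical simplifications.

```python
import sqlite3

# Hypothetical, simplified name/value schema (not the real MODX tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content   (id INTEGER PRIMARY KEY, pagetitle TEXT);
    CREATE TABLE tv_values (contentid INTEGER, name TEXT, value TEXT);
    INSERT INTO content VALUES (1, 'Gig'), (2, 'Fair'), (3, 'Talk');
    INSERT INTO tv_values VALUES
        (1, 'event_date', '2014-03-01'), (1, 'event_featured', '0'),
        (2, 'event_date', '2014-01-15'), (2, 'event_featured', '1'),
        (3, 'event_date', '2014-02-10'), (3, 'event_featured', '0');
""")

# Each subquery acts as a virtual table mapping a content id to one
# property, so the properties become ordinary sortable columns.
events = conn.execute("""
    SELECT c.id, c.pagetitle, d.event_date, f.event_featured
    FROM content AS c
    LEFT JOIN (
        SELECT contentid AS id, value AS event_date
        FROM tv_values
        WHERE name = 'event_date'
    ) AS d ON d.id = c.id
    LEFT JOIN (
        SELECT contentid AS id, value AS event_featured
        FROM tv_values
        WHERE name = 'event_featured'
    ) AS f ON f.id = c.id
    ORDER BY f.event_featured DESC, d.event_date
    LIMIT 20
""").fetchall()

for row in events:
    print(row)
```

Because the pivoted result has one row per content id, a plain LIMIT 20 now limits the number of ids rather than the number of name/value rows, provided each id has at most one value per property.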
By the way, you'll make the query easier to read, and may boost performance, if you change
AND (
tv_props.name='event_featured'
OR tv_props.name='event_link_through'
OR tv_props.name='event_title' etc )
to
AND tv_props.name IN ('event_featured',
'event_link_through',
'event_title', etc)
You've probably noticed I'm a bit of a stickler for indentation in SQL queries. I find this helpful; I often find mistakes while I'm fixing up the indentation. Your practice may vary.