SQL join query - join instead of inline query - mysql

I would like to use join instead of inline queries but the one I was able to make is giving wrong values.
Please check this link - http://www.sqlfiddle.com/#!2/57cad/9
It has 2 individual queries which give correct values and a join query which gives an incorrect result.
Can someone please help...

Answer shown here: SQLFiddle
Your queries could be improved upon in a few areas which make their JOINing a little more obvious.
In your first query, a better version has the the GROUP BY clause's columns listed in the SELECT clause and your HAVING clause (while working) becomes the WHERE clause (IMO: The best practice is to use aggregate function only in the HAVING clause:)
SELECT usercode, ROUND(coalesce(sum(paymentamount)*0.99,0),2) AS payment
FROM accountpayments
WHERE usercode = 21
GROUP BY usercode;
Your second query can be rewritten as a JOIN (vs. the subquery)
SELECT campaigns.usercode, ROUND(coalesce(sum(lmc_cds.total_spending),0),2) AS total_spending
FROM logsmaincontrols_campaigns_daily_stats AS lmc_cds
JOIN campaigns
ON campaigns.campcode = lmc_cds.campcode
WHERE campaigns.usercode = 21;
Since the queries don't share any tables, I decided to JOIN the queries to each other as derived tables using the usercode as the JOINing column.
SELECT t1.usercode, t1.payment, t2.total_spending
FROM (SELECT usercode, ROUND(coalesce(sum(paymentamount)*0.99,0),2) AS payment
FROM accountpayments
WHERE usercode = 21
GROUP BY usercode) AS t1
JOIN (SELECT campaigns.usercode, ROUND(coalesce(sum(lmc_cds.total_spending),0),2) AS total_spending
FROM logsmaincontrols_campaigns_daily_stats AS lmc_cds
JOIN campaigns
ON campaigns.campcode = lmc_cds.campcode
WHERE campaigns.usercode = 21) AS t2
ON t1.usercode = t2.usercode;

Related

SQL LEFT JOIN with possible join condition duplicate match

Here is my query so far:
SELECT
b.cs_bidding_id, b.cs_bidding_user_id,
floor(AVG(u.cs_rating)) AS cs_user_rating
FROM cs_biddings b LEFT JOIN
cs_user_ratings u ON u.cs_user_rated_id = b.cs_bidding_user_id
I would like to get the avg rating of the user's per bidding post.
However this does not work for multiple biddings because whenever the join condition is satisfied, it wont let me fetch other avg rating for other biddings that shares the same bidding_user_id
desired result:
Unsummarized Query:
SELECT
b.cs_bidding_id,
b.cs_bidding_title,
b.cs_bidding_details,
b.cs_bidding_user_id,
b.cs_bidding_permalink,
b.cs_bidding_added,
b.cs_bidding_picture,
b.cs_bidding_status,
b.cs_bidding_location,
floor(AVG(u.cs_rating)) AS cs_owner_rating
FROM cs_biddings b LEFT JOIN
cs_user_ratings u ON u.cs_user_rated_id = b.cs_bidding_user_id
Your query is malformed. It is an aggregation query (because of the AVG()) but the SELECT columns are inconsistent with the aggregation columns (well, there are none of those).
Fixing the group by might fix your problem:
SELECT b.cs_bidding_id, b.cs_bidding_user_id,
floor(AVG(u.cs_rating)) AS cs_user_rating
FROM cs_biddings b LEFT JOIN
cs_user_ratings u
ON u.cs_user_rated_id = b.cs_bidding_user_id
GROUP BY b.cs_bidding_id, b.cs_bidding_user_id;
I'm not sure if you want both columns in the GROUP BY (and the result set). However, without sample data and desired results, it is unclear what you actually intend.

mysql request retrieve data combined with 2 tables

I have three tables
I would like to request all persons, where the companyTypeID is for example '2'. How can I query that?
You could use the in operator:
SELECT *
FROM persons
WHERE CompanyId IN (SELECT CompanyId
FROM company
WHERE CompanyTypeId = 2)
Do an INNER JOIN (left or right joins are functionally similar, the only difference is which side of the equation is honoured). Nested queries / subqueries are extremely expensive if they become dependent in nature—even though that's not the scenario in your case—and I do not recommend using them for large tables.
SELECT t1.*
FROM Persons AS t1
LEFT JOIN Company AS t2 ON
t2.companyTypeID = t1.CompanyID
To ensure that you are using an index for joining, you should create indexes on the companyTypeID and CompanyID columns of each table. Prepend EXPLAIN EXTENDED to the query above to verify that the indexes are indeed being used.
You need to use SQL statement JOIN. It's all about mathematical sets!
A Graphic (and superficial) explanation about JOINs statement under mathematical sets approach:
https://www.google.com.br/imgres?imgurl=http%3A%2F%2Fi.imgur.com%2FhhRDO4d.png&imgrefurl=https%3A%2F%2Fwww.reddit.com%2Fr%2Fprogramming%2Fcomments%2F1xlqeu%2Fsql_joins_explained_xpost_rsql%2F&docid=q4Ank7XVw8j7DM&tbnid=f-L_7a3HkxW_3M%3A&w=1000&h=740&client=ubuntu&bih=878&biw=1745&ved=0ahUKEwj2oaer5evPAhUBFZAKHe4nAdAQMwgdKAEwAQ&iact=mrc&uact=8#h=740&w=1000
SQL for your problem(If I got this):
SELECT * FROM Persons p LEFT JOIN Company c ON c.ID = p.companyID
LEFT JOIN CompanyType ct ON ct.ID = c.companyTypeID
WHERE c.id = 2;
Why 'LEFT JOIN'? Checkout the the link about set approach explanation above!
The second 'LEFT JOIN' is just to bring the description of companyType table.
The "*" in statement is didatic! You must not use this in production for the good of performance. Therefore, you must to replace the '*' with all the fields you need. For example, "CONCAT(p.firstname,CONCAT(" ",p.lastname)) as PersonName, c.name CompanyName,..." or something like that. I suppose you're using MySQL
Hope I've helped!
Perform an SQL join:
Joins are quicker than the subquery, in the other post:
SELECT Persons.firstname AS first name
FROM Persons
JOIN Company ON company.ID == Persons.CompanyID
WHERE Company.companyTypeID == 2
Although you will need to select all he fields you want, using alias to simplify the names

SQL query for large amount of data with many joins

I have written a sql query for my requirement.
This is working fine for me. This is taking 0.0006 sec to execute.
I want to know from sql experts "will this work fine with large amount of data?".
I have written my query below.
SELECT HM_customers.id,
HM_customers.username,
HM_customers.firstname,
HM_customers.lastname,
HM_customers.company,
HM_customers_address_bank.field_data
FROM HM_orders
JOIN HM_order_items
ON HM_order_items.order_id = HM_orders.id
JOIN HM_bid
ON HM_order_items.bid_id = HM_bid.bid_id
JOIN HM_customers
ON HM_bid.user_id = HM_customers.id
JOIN HM_customers_address_bank
ON HM_customers_address_bank.id = HM_customers.default_billing_address
WHERE HM_orders.id = '4'
Any expert can advice me or let me know how can I improve this query. Please suggest me if any issue in this query.
NOTE:- This is a simple query. But I want to know, will this work with large amount of data with less time
You don't need to include the orders table:
SELECT c.id,
c.username,
c.firstname,
c.lastname,
c.company,
cb.field_data
FROM HM_order_items oi
JOIN HM_bid b
ON oi.bid_id = b.bid_id
JOIN HM_customers c
ON b.user_id = c.id
JOIN HM_customers_address_bank cb
ON cb.id = c.default_billing_address
WHERE oi.order_id = '4';
Your query can also result in duplicate rows, if a customer bids on the same items multiple times. If you put in a select distinct, then you will incur overhead of duplicate elimination. If this becomes a problem, you will probably want to restructure the query as an exists.
There are few points worth noting
1) The reference to an outer table column in the WHERE clause prevents the OUTER JOIN from returning any non-matched rows, which implicitly converts the query to an INNER JOIN. This is probably a bug in the query or a misunderstanding of how OUTER JOIN works.
2) Selecting all columns with the * wildcard will cause the query's meaning and behavior to change if the table's schema changes, and might cause the query to retrieve too much data. You should only choose columns you need.
Please make your driven table to 'HM_customers' as all your data is coming from this table and change your join like this way, hopefully this will help you :)
SELECT hmCust.id,
hmCust.username,
hmCust.firstname,
hmCust.lastname,
hmCust.company,
hmCustAdd.field_data
FROM HM_customers hmCust
INNER JOIN HM_bid hmBid
ON hmBid.user_id = hmCust.id
INNER JOIN HM_customers_address_bank hmCustAdd
ON hmCustAdd.id = hmCust.default_billing_address
INNER JOIN HM_order_items hmOrderItem
ON hmOrderItem.order_id = hmBid.bid_id
INNER JOIN HM_orders hmOrder
ON hmOrder.id = hmOrderItem.order_id
WHERE hmOrder.id = '4'

MYSQL many to many 3 tables query

EDIT. I missed the one main issue I was having. I want to display all the unique 'device_MAC' rows. So I want this query to output 3 rows (as per the original query). The issue I am having is connecting the data table to the remote_node table via dt_short = rn_short where the maximum timestamp for dt_short in the data table.
I am having trouble running a query on 3 tables (2 have many to many relations).
What I am trying to do:
Get each distinct rn_IEEE from the remotenodes table with the maximum timestamp (in the example this will get 3 rows with 3 distinct short addresses rn_short)
Join with the devicenames table on device_IEEE
Get each distinct dt_short from the data table with the maximum timestamp
Join dt_short with rn_short from the query above
Now the problem I am running into is that I can do the queries for the above individually, I have even gotten the first 3 of them together into a query but I cannot seem to properly join the last bit of data to get the result that I want.
I have been going in circles trying to solve this. Here is a link to SQL Fiddle which contains all the test data and the query as far as I got it, it does what i want for the first line but from table 'data' after the first line is NULL:
See this SQL fiddle
After going through your requirements and the data, it looks like you just need to change your query to include an INNER JOIN on the data table instead of a LEFT JOIN
See SQL Fiddle with Demo
select rn.*, dn.*, d.*
from remotenodes rn
inner join devicenames dn
on rn.rn_IEEE = dn.device_IEEE
and rn.rn_timestamp = (SELECT MAX(rn_timestamp) FROM remotenodes
WHERE rn.rn_IEEE = rn_IEEE
GROUP BY rn_IEEE)
inner join data d
on rn.rn_short = d.dt_short
AND d.dt_timestamp = (SELECT MAX(d2.dt_timestamp) AS ts
FROM data d2
WHERE d.dt_short = d2.dt_short
GROUP BY d2.dt_short)
what you have done the query in your SQL fiddle is right.Instead of using left join use inner join so that it will give you the first row
cheers.
Thanks for all your answers everyone. I managed to solve the problem by using views.
It's not the most efficient way but I think it will do for now.
Here is the SQL Fiddle link:
http://sqlfiddle.com/#!2/4076e/8
Try this query, for me its returning one row:
SELECT rn_short, rn_IEEE, device_name
FROM
(SELECT DISTINCTROW dt_short FROM (SELECT * FROM `data` ORDER BY `dt_timestamp` DESC) as data ) as a
JOIN
(SELECT rn_IEEE, rn_short, device_name FROM devicenames dn JOIN (SELECT DISTINCTROW rn_IEEE, rn_short FROM (SELECT * FROM `remotenodes` ORDER BY `rn_timestamp` DESC) as remotenodes GROUP BY rn_IEEE) as rn ON dn.device_IEEE = rn.rn_IEEE) as b
ON a.dt_short = b.rn_short

In what order are MySQL JOINs evaluated?

I have the following query:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123;
I have the following questions:
Is the USING syntax synonymous with ON syntax?
Are these joins evaluated left to right? In other words, does this query say: x = companies JOIN users; y = x JOIN jobs; z = y JOIN useraccounts;
If the answer to question 2 is yes, is it safe to assume that the companies table has companyid, userid and jobid columns?
I don't understand how the WHERE clause can be used to pick rows on the companies table when it is referring to the alias "j"
Any help would be appreciated!
USING (fieldname) is a shorthand way of saying ON table1.fieldname = table2.fieldname.
SQL doesn't define the 'order' in which JOINS are done because it is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to regard the third JOIN as joining the new table to the results of the first JOIN, the fourth JOIN as joining the results of the second JOIN, and so on.
More rarely, the specified order can influence the behaviour of the query optimizer, due to the way it influences the heuristics.
No. The way the query is assembled, it requires that companies and users both have a companyid, jobs has a userid and a jobid and useraccounts has a userid. However, only one of companies or user needs a userid for the JOIN to work.
The WHERE clause is filtering the whole result -- i.e. all JOINed columns -- using a column provided by the jobs table.
I can't answer the bit about the USING syntax. That's weird. I've never seen it before, having always used an ON clause instead.
But what I can tell you is that the order of JOIN operations is determined dynamically by the query optimizer when it constructs its query plan, based on a system of optimization heuristics, some of which are:
Is the JOIN performed on a primary key field? If so, this gets high priority in the query plan.
Is the JOIN performed on a foreign key field? This also gets high priority.
Does an index exist on the joined field? If so, bump the priority.
Is a JOIN operation performed on a field in WHERE clause? Can the WHERE clause expression be evaluated by examining the index (rather than by performing a table scan)? This is a major optimization opportunity, so it gets a major priority bump.
What is the cardinality of the joined column? Columns with high cardinality give the optimizer more opportunities to discriminate against false matches (those that don't satisfy the WHERE clause or the ON clause), so high-cardinality joins are usually processed before low-cardinality joins.
How many actual rows are in the joined table? Joining against a table with only 100 values is going to create less of a data explosion than joining against a table with ten million rows.
Anyhow... the point is... there are a LOT of variables that go into the query execution plan. If you want to see how MySQL optimizes its queries, use the EXPLAIN syntax.
And here's a good article to read:
http://www.informit.com/articles/article.aspx?p=377652
ON EDIT:
To answer your 4th question: You aren't querying the "companies" table. You're querying the joined cross-product of ALL four tables in your FROM and USING clauses.
The "j.jobid" alias is just the fully-qualified name of one of the columns in that joined collection of tables.
In MySQL, it's often interesting to ask the query optimizer what it plans to do, with:
EXPLAIN SELECT [...]
See "7.2.1 Optimizing Queries with EXPLAIN"
Here is a more detailed answer on JOIN precedence. In your case, the JOINs are all commutative. Let's try one where they aren't.
Build schema:
CREATE TABLE users (
name text
);
CREATE TABLE orders (
order_id text,
user_name text
);
CREATE TABLE shipments (
order_id text,
fulfiller text
);
Add data:
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
Run query:
SELECT *
FROM users
LEFT OUTER JOIN orders
ON orders.user_name = users.name
JOIN shipments
ON shipments.order_id = orders.order_id
Result:
Only the Bob row is returned
Analysis:
In this query the LEFT OUTER JOIN was evaluated first and the JOIN was evaluated on the composite result of the LEFT OUTER JOIN.
Second query:
SELECT *
FROM users
LEFT OUTER JOIN (
orders
JOIN shipments
ON shipments.order_id = orders.order_id)
ON orders.user_name = users.name
Result:
One row for Bob (with the fulfillment data) and one row for Mary with NULLs for fulfillment data.
Analysis:
The parenthesis changed the evaluation order.
Further MySQL documentation is at https://dev.mysql.com/doc/refman/5.5/en/nested-join-optimization.html
SEE http://dev.mysql.com/doc/refman/5.0/en/join.html
AND start reading here:
Join Processing Changes in MySQL 5.0.12
Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
Resolution of column names in NATURAL or USING joins.
Transformation of NATURAL or USING joins into JOIN ... ON.
Resolution of column names in the ON condition of a JOIN ... ON.
Im not sure about the ON vs USING part (though this website says they are the same)
As for the ordering question, its entirely implementation (and probably query) specific. MYSQL most likely picks an order when compiling the request. If you do want to enforce a particular order you would have to 'nest' your queries:
SELECT c.*
FROM companies AS c
JOIN (SELECT * FROM users AS u
JOIN (SELECT * FROM jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123)
)
as for part 4: the where clause limits what rows from the jobs table are eligible to be JOINed on. So if there are rows which would join due to the matching userids but don't have the correct jobid then they will be omitted.
1) Using is not exactly the same as on, but it is short hand where both tables have a column with the same name you are joining on... see: http://www.java2s.com/Tutorial/MySQL/0100__Table-Join/ThekeywordUSINGcanbeusedasareplacementfortheONkeywordduringthetableJoins.htm
It is more difficult to read in my opinion, so I'd go spelling out the joins.
3) It is not clear from this query, but I would guess it does not.
2) Assuming you are joining through the other tables (not all directly on companyies) the order in this query does matter... see comparisons below:
Origional:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
What I think it is likely suggesting:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = u.userid
JOIN useraccounts AS us on us.userid = u.userid
WHERE j.jobid = 123
You could switch you lines joining jobs & usersaccounts here.
What it would look like if everything joined on company:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = c.userid
JOIN useraccounts AS us on us.userid = c.userid
WHERE j.jobid = 123
This doesn't really make logical sense... unless each user has their own company.
4.) The magic of sql is that you can only show certain columns but all of them are their for sorting and filtering...
if you returned
SELECT c.*, j.jobid....
you could clearly see what it was filtering on, but the database server doesn't care if you output a row or not for filtering.