I have read a few posts on this, but I don't seem to be able to fix my problem.
I am calling two database queries to populate two arrays that run side by side, but they aren't matching, as the order they come out in is different. I believe it has something to do with the GROUP BY, and this may require a subquery, but again I'm a little lost...
Query 1:
SELECT count(bids_bid.total_bid), bidtime_bid, users_usr.company_usr, users_usr.id_usr
FROM bids_bid
INNER JOIN users_usr
ON bids_bid.user_bid = users_usr.id_usr
WHERE auction_bid = 36
GROUP BY user_bid
ORDER BY bidtime_bid ASC
Query 2:
SELECT auction_bid, user_bid, bidtime_bid, bids_bid.total_bid
FROM bids_bid
WHERE auction_bid = 36
ORDER BY bidtime_bid ASC
Even though the 'Order by' is the same the results aren't matching. The users are coming out in a different sequence.
I hope this makes sense, and thanks in advance.
* Update *
I just wanted to add a bit of clarity on what output I want. I need to show only one result per user (user_bid); the second query shows all users' rows. I only need the first query to show the first row entered for each user. So if I could order before the group, and by min date, that would be ace...
It's to be expected. You're fetching fields that are NOT involved in the grouping and are not part of an aggregate function. MySQL allows such things, but generally the results for the ungrouped/unaggregated columns can be wonky.
Because MySQL is free to choose WHICH of the potentially multiple 'free' rows to use for the actual result row, you will get different results. Generally it picks the first-encountered 'free choice' row, but that's not defined/guaranteed.
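One way to sidestep the "free choice" problem is to make sure every selected column is either in the GROUP BY or wrapped in an aggregate. Below is a minimal sketch of that idea using Python's built-in sqlite3 as a stand-in for MySQL. The table and column names follow the question; the sample rows and the b/u aliases are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_usr (id_usr INTEGER PRIMARY KEY, company_usr TEXT);
    CREATE TABLE bids_bid (
        auction_bid INTEGER, user_bid INTEGER,
        bidtime_bid TEXT, total_bid INTEGER
    );
    -- Invented sample data: two users, two bids each, interleaved in time.
    INSERT INTO users_usr VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO bids_bid VALUES
        (36, 2, '2012-01-01 09:00', 100),
        (36, 1, '2012-01-01 10:00', 110),
        (36, 2, '2012-01-01 11:00', 120),
        (36, 1, '2012-01-01 12:00', 130);
""")

# Every selected column is either grouped or aggregated, so each user's
# row is fully determined: bid count plus the time of their first bid.
first_bids = conn.execute("""
    SELECT u.id_usr, u.company_usr,
           COUNT(b.total_bid) AS bid_count,
           MIN(b.bidtime_bid) AS first_bid
    FROM bids_bid AS b
    INNER JOIN users_usr AS u ON b.user_bid = u.id_usr
    WHERE b.auction_bid = 36
    GROUP BY u.id_usr, u.company_usr
    ORDER BY first_bid ASC
""").fetchall()

for row in first_bids:
    print(row)
```

Ordering by the aggregated first_bid rather than the bare bidtime_bid is what keeps the user sequence stable here.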
You use grouping when you want unique rows in the result set according to some group id (column name). Grouping is usually used with aggregate functions such as MIN, MAX, COUNT and SUM.
Ordering or an inner query has nothing to do with how the result set is grouped. I suggest reading some introductory tutorials about grouping and treating SQL as a set-based language; most of set theory applies to SQL, and you'll be fine.
So I was complicating issues that I didn't need to. The solution I found is below.
SELECT users_usr.company_usr,
       users_usr.id_usr,
       bids_bid.bidtime_bid,
       MIN(bidtime_bid) AS minbid
FROM bids_bid
INNER JOIN users_usr
ON bids_bid.user_bid = users_usr.id_usr
WHERE auction_bid = 36
GROUP BY id_usr
ORDER BY minbid ASC
Thanks everyone for making me look (try) harder...
I am currently experiencing a (to me) very strange behaviour for one of my mysql 5.6 queries.
I have a given system I am trying to optimize. One step is to only select the fields necessary for the next operation.
The given query looks as follows:
SELECT oxv_oxcategories_6_fr.*
FROM oxv_oxobject2category_6 AS oxobject2category
LEFT JOIN oxv_oxcategories_6_fr ON oxv_oxcategories_6_fr.oxid =
oxobject2category.oxcatnid
WHERE oxobject2category.oxobjectid = '<hashed id>'
AND oxv_oxcategories_6_fr.oxid IS NOT NULL
AND (oxv_oxcategories_6_fr.oxactive = 1
AND oxv_oxcategories_6_fr.oxhidden = '0')
ORDER BY oxobject2category.oxtime
I have taken the liberty of using more sensible naming in my own query:
SELECT
category_view.*
FROM oxv_oxobject2category_6 category_mapping_view
LEFT JOIN oxv_oxcategories_6_fr category_view ON category_view.OXID =
category_mapping_view.OXCATNID
WHERE category_mapping_view.OXOBJECTID = '<hashed id>'
AND category_view.OXID IS NOT NULL
AND (category_view.OXACTIVE = 1
AND category_view.OXHIDDEN = '0')
ORDER BY category_mapping_view.OXTIME
As you can see, there is not much difference, only the naming is different. So far, everything works as expected. Now I am trying to only select the values I need. So the query looks like this:
SELECT
category_view.OXID,
category_view.OXTITLE
FROM oxv_oxobject2category_6 category_mapping_view
LEFT JOIN oxv_oxcategories_6_fr category_view ON category_view.OXID =
category_mapping_view.OXCATNID
WHERE category_mapping_view.OXOBJECTID = '<hashed id>'
AND category_view.OXID IS NOT NULL
AND (category_view.OXACTIVE = 1
AND category_view.OXHIDDEN = '0')
ORDER BY category_mapping_view.OXTIME;
This also works as expected. But, I also need the field OXPARENTID, so I change the SELECT statement to
category_view.OXID,
category_view.OXTITLE,
category_view.OXPARENTID
Now the order of the items is different and I cannot seem to find out why that is. The new as well as the original query both sort for OXTIME without that field being present in the final result set. There are about 10 entries where OXTIME is 0, and it is those items that get turned around (ordering-wise) as soon as I query for OXPARENTID.
In the original query, OXPARENTID is present as well, so why does it make a difference now? I am guessing that there is some sort of ordering logic going on I do not yet know about.
Mind that both joined tables are actually views; maybe that has something to do with it. Also, OXID and OXPARENTID are both md5-hashed values.
Any help would be greatly appreciated.
EDIT
To clarify: I know that having multiple entries with OXTIME equal to 0 makes it impossible to predict beforehand which entry will come out on top. However, I still expected the order of the entries to be the same every time I run the query (regardless of what I am selecting).
One answer (@GordonLinoff) explains that
[...] the same query can return the results in different order on different runs
Where does this "randomness" come from?
Your ordering is:
ORDER BY category_mapping_view.OXTIME;
And then you state:
There are about 10 entries where OXTIME is 0, and it is those items that get turned around (ordering-wise) as soon as I query for OXPARENTID.
What you have are ties in the keys. The results can be in any order -- and the same query can return the results in different order on different runs. Technically, the ordering in SQL is unstable.
You can fix this by including another column in the ORDER BY so each row is uniquely defined by the ORDER BY keys. Perhaps that is OXID:
ORDER BY category_mapping_view.OXTIME, category_view.OXID;
By the way, it is "obvious" that sorting in SQL is unstable. Why? SQL tables represent unordered sets. There is no ordering to fall back on when the keys are the same.
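To illustrate the tie-breaker idea, here is a small sketch using Python's built-in sqlite3 in place of MySQL. The single category table is a hypothetical simplification of the two views in the question, and the rows are invented. Three rows tie on oxtime = 0, and adding oxid to the ORDER BY pins their relative order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined views; rows are invented.
    CREATE TABLE category (oxid TEXT PRIMARY KEY, oxtitle TEXT, oxtime INTEGER);
    INSERT INTO category VALUES
        ('c3', 'Shoes', 0),
        ('c1', 'Hats',  0),
        ('c2', 'Bags',  0),
        ('c4', 'Belts', 5);
""")

# oxtime alone leaves the three zero rows in an unspecified order;
# the unique oxid as a second sort key makes the order fully determined.
ordered = conn.execute("""
    SELECT oxid, oxtitle
    FROM category
    ORDER BY oxtime, oxid
""").fetchall()

for row in ordered:
    print(row)
```

Any column (or combination) that is unique per row would serve as the tie-breaker; oxid is simply the one suggested in the answer.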
To calculate the price of invoices (that have *invoice item*s in a separate table and linked to the invoices), I had written this query:
SELECT `i`.`id`, SUM(ii.unit_price * ii.quantity) invoice_price
FROM (`invoice` i)
JOIN `invoiceitem` ii
ON `ii`.`invoice_id` = `i`.`id`
WHERE `i`.`user_id` = '$user_id'
But it only returned ONE row.
After some research, I found that I had to add GROUP BY i.id at the end of the query. With this, the results were as expected.
In my opinion, even without GROUP BY i.id, nothing is lost and it should work well!
Please tell me in a few simple sentences...
Why should I always add the GROUP BY i.id? What is lost without it? And, maybe the most practical question: how can I notice that I have forgotten the GROUP BY?
You have to include the GROUP BY because there are many IDs that went into the sum. If you don't specify it, MySQL just picks one of them (typically the first) and sums across the entire result set. GROUP BY tells MySQL to sum (or, generically, aggregate) for each grouped entity.
Why should I always use GROUP BY?
SUM() and others are aggregate functions. They collapse a set of rows into one value per group, so you need GROUP BY whenever you want one value per id rather than one value for the whole result.
What is lost without it?
From the documentation:
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
In the end, there is nothing to remember, as these are GROUP BY aggregate functions. You will quickly tell from the result that you have forgotten GROUP BY: you get a single row aggregated over the entire result set instead of one row per grouped subset.
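The difference is easy to see side by side. The following sketch uses Python's built-in sqlite3 as a stand-in for MySQL; the table and column names come from the question, while the sample rows and the user id 7 are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE invoiceitem (invoice_id INTEGER, unit_price INTEGER, quantity INTEGER);
    -- Invented data: user 7 has two invoices worth 25 and 12.
    INSERT INTO invoice VALUES (1, 7), (2, 7);
    INSERT INTO invoiceitem VALUES (1, 10, 2), (1, 5, 1), (2, 3, 4);
""")
user_id = 7

# No GROUP BY: all matching rows form a single group, so one row comes back.
total_only = conn.execute("""
    SELECT SUM(ii.unit_price * ii.quantity) AS invoice_price
    FROM invoice AS i
    JOIN invoiceitem AS ii ON ii.invoice_id = i.id
    WHERE i.user_id = ?
""", (user_id,)).fetchall()

# GROUP BY i.id: one group, and therefore one row, per invoice.
per_invoice = conn.execute("""
    SELECT i.id, SUM(ii.unit_price * ii.quantity) AS invoice_price
    FROM invoice AS i
    JOIN invoiceitem AS ii ON ii.invoice_id = i.id
    WHERE i.user_id = ?
    GROUP BY i.id
    ORDER BY i.id
""", (user_id,)).fetchall()

print(total_only)
print(per_invoice)
```

The sketch also binds user_id as a query parameter instead of interpolating '$user_id' into the SQL string, which is a design choice of this example rather than something the question asked about.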
Is it possible to have count in the select clause with a group by which is suppressed in the count? I need the count to ignore the group by clause
I have this query, which counts the total number of entries. The query is generated generically, so I can't make any far-reaching changes such as adding subqueries.
In some specific cases a GROUP BY is needed to retrieve the correct rows, so the GROUP BY can't be removed.
SELECT count(dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
GROUP BY dv.id
Well, the answer to your question is simply that you can't have an aggregate that works on all the results while having a GROUP BY clause. That's the whole purpose of GROUP BY: to create groups that change the behaviour of aggregates:
The GROUP BY clause causes aggregations to occur in groups (naturally) for the columns you name.
cf this blog post which is only the first result I found on google on this topic.
You'd need to redesign your query, the easiest way being to create a subquery, or one hell of a join. But without the schema and a little context on what you want this query to do, I can't give you an alternative that works.
I just can tell you that you're trying to use a hammer to tighten a screw...
Have found an alternative where COUNT DISTINCT is used
SELECT count(distinct dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
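A quick way to see why COUNT(DISTINCT ...) helps is to run both variants against the same data. This is a sketch using Python's built-in sqlite3 as a stand-in for MySQL, with invented rows. Since older SQLite versions have no IF() function, the filter is written as de.id IS NULL, which is assumed to be equivalent to IF(de.id IS NULL,0,1)=0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_voucher (id INTEGER PRIMARY KEY);
    CREATE TABLE data_voucher_enclosure (
        id INTEGER PRIMARY KEY, data_voucher_id INTEGER
    );
    -- Invented data: vouchers 1 and 3 have no enclosure, voucher 2 has one.
    INSERT INTO data_voucher VALUES (1), (2), (3);
    INSERT INTO data_voucher_enclosure VALUES (10, 2);
""")

# With GROUP BY dv.id the count is computed per voucher: one row of 1 each.
grouped = conn.execute("""
    SELECT COUNT(dv.id) AS num
    FROM data_voucher AS dv
    LEFT JOIN data_voucher_enclosure AS de ON de.data_voucher_id = dv.id
    WHERE de.id IS NULL
    GROUP BY dv.id
""").fetchall()

# Without GROUP BY, COUNT(DISTINCT dv.id) yields a single overall total
# and still ignores duplicate ids that a join could introduce.
total = conn.execute("""
    SELECT COUNT(DISTINCT dv.id) AS num
    FROM data_voucher AS dv
    LEFT JOIN data_voucher_enclosure AS de ON de.data_voucher_id = dv.id
    WHERE de.id IS NULL
""").fetchall()

print(grouped)
print(total)
```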
I have followed the tutorial over at tizag for the MAX() mysql function and have written the query below, which does exactly what I need. The only trouble is I need to JOIN it to two more tables so I can work with all the rows I need.
$query = "SELECT idproducts, MAX(date) FROM results GROUP BY idproducts ORDER BY MAX(date) DESC";
I have this query below, which has the JOIN I need and works:
$query = ("SELECT *
FROM operators
JOIN products
ON operators.idoperators = products.idoperator JOIN results
ON products.idProducts = results.idproducts
ORDER BY drawndate DESC
LIMIT 20");
Could someone show me how to merge the top query with the JOIN element from my second query? I am new to PHP and MySQL, this being my first adventure into a computer language. I have read a lot and tried really hard to get those two queries to work together, but I have hit a brick wall. I cannot work out how to add the JOIN element to the first query :(
Could some kind person take pity on a newb and help me?
Try this query.
SELECT
    *
FROM
    operators
    JOIN products
        ON operators.idoperators = products.idoperator
    JOIN (
        SELECT
            idproducts,
            MAX(date)
        FROM results
        GROUP BY idproducts
    ) AS t
        ON products.idproducts = t.idproducts
ORDER BY drawndate DESC
LIMIT 20
JOINs function somewhat independently of aggregate functions; they just change the intermediate result set upon which the aggregate functions operate. I like to point to the way the MySQL documentation is written: it uses the term 'table_reference' in the SELECT syntax and expands on what that means in the JOIN syntax. Basically, any simple query which has a table specified can expand that table to a complete JOIN clause, and the query will operate the same basic way, just with a modified intermediate result set.
I say "intermediate result set" to hint at the mindset which helped me understand JOINs and aggregation. Understanding the order in which MySQL builds your final result is critical to knowing how to reliably get the results you want. Generally, it starts by looking at the first row of the first table you specify after 'FROM', and decides if it might match by looking at the 'WHERE' clauses. If the row is not immediately discardable, it attempts to JOIN that row to the first JOIN specified, and repeats the "will this be discarded by WHERE?" check. This repeats for all JOINs, which either add rows to your result set, remove them, or leave just the one, as appropriate for your JOINs, WHEREs and data. This process builds what I am referring to when I say "intermediate result set". Somewhere between starting and finishing your complete query, MySQL has in its memory a potentially massive table-like structure of data which it built using the process I just described. Only then does it begin to aggregate (GROUP) the results according to your criteria.
So for your query, it depends on what specifically you are going for (not entirely clear in OP). If you simply want the MAX(date) from the second query, you can simply add that expression to the SELECT clause and then add an aggregation spec to the end:
SELECT *, MAX(date)
FROM operators
...
GROUP BY idproducts
ORDER BY ...
Alternatively, you can add the JOIN section of the second query to the first.
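As a rough illustration of combining the two queries, the sketch below joins operators and products to a derived table holding the latest date per product. It uses Python's built-in sqlite3 in place of MySQL. The name and title columns, the latest alias and all sample rows are invented; only the key columns and the date column come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operators (idoperators INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (
        idproducts INTEGER PRIMARY KEY, idoperator INTEGER, title TEXT
    );
    CREATE TABLE results (idproducts INTEGER, date TEXT);
    -- Invented data: product 10 has two results, product 20 has one.
    INSERT INTO operators VALUES (1, 'OpA'), (2, 'OpB');
    INSERT INTO products VALUES (10, 1, 'P10'), (20, 2, 'P20');
    INSERT INTO results VALUES
        (10, '2013-01-01'), (10, '2013-03-01'), (20, '2013-02-01');
""")

# The derived table t is aggregated first (one row per product), then
# joined like any other table, so the outer query needs no GROUP BY.
latest_per_product = conn.execute("""
    SELECT o.name, p.title, t.latest
    FROM operators AS o
    JOIN products AS p ON o.idoperators = p.idoperator
    JOIN (
        SELECT idproducts, MAX(date) AS latest
        FROM results
        GROUP BY idproducts
    ) AS t ON p.idproducts = t.idproducts
    ORDER BY t.latest DESC
    LIMIT 20
""").fetchall()

for row in latest_per_product:
    print(row)
```

Giving MAX(date) an alias makes it straightforward to reference in the outer SELECT and ORDER BY.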
I have a simple report sending framework that basically does the following things:
It performs a SELECT query, it makes some text-formatted tables based on the results, it sends an e-mail, and it performs an UPDATE query.
This system is a generalization of an older one, in which all of the operations were hard coded. However, in pushing all of the logic of what I'd like to do into the SELECT query, I've run across a problem.
Before, I could get most of the information for my text tables by saying:
SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT';
Then, when I needed an extra number for the e-mail, also do:
SELECT COUNT(*) FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';
Now, I no longer have the luxury of multiple SELECT queries. What I'd like to do is something like:
SELECT Tabl.Name, Tabl.Address, COUNT(Results.UID) AS Totals
FROM Databas.Tabl
LEFT JOIN Databas.Tabl Results
ON Tabl.UID = Results.UID
AND Results.TimeLogged='Noon'
WHERE Status='URGENT';
This, at least in my head, says to get a total count of all the rows that were SELECTed and also have some conditional.
In reality, though, this gives me the "1140 - Mixing of GROUP columns with no GROUP columns illegal if no GROUP BY" error. The problem is, I don't want to GROUP BY. I want this COUNT to redundantly repeat the number of results that SELECT found whose TimeLogged='Noon'. Or I want to remove the AND clause and include, as a column in the result of the SELECT statement, the number of results that that SELECT statement found.
GROUP BY is not the answer, because that causes it to get the COUNT of only the rows that have the same value in some column. And COUNT might not even be the way to go about this, although it's what comes to mind. FOUND_ROWS() won't do the trick, since it needs to be part of a secondary query, and I only get one (plus there's no LIMIT involved), and ROW_COUNT() doesn't seem to work since it's a SELECT statement.
I may be approaching it from the wrong angle entirely. But what I want to do is get COUNT-type information about the results of a SELECT query, as well as all the other information that the SELECT query returned, in one single query.
=== Here's what I've got so far ===
SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
FROM Databas.Tabl
WHERE TimeLogged='Noon'
GROUP BY NULL) Results
ON 0 = Results.Bonus
WHERE Status='URGENT';
This does use sub-SELECTs, which I was initially hoping to avoid, but now realize that hope may have been foolish. Plus it seems like the COUNTing SELECT sub-queries will be less costly than the main query since the COUNT conditionals are all on one table, but the real SELECT I'm working with has to join on multiple different tables for derived information.
The key realizations are that I can GROUP BY NULL, which will return a single result so that COUNT(*) will actually catch everything, and that I can force a correlation to this column by just faking a Bonus column with 0 on both tables.
It looks like this is the solution I will be using, but I can't actually accept it as an answer until tomorrow. Thanks for all the help.
SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
FROM Databas.Tabl
WHERE TimeLogged='Noon'
GROUP BY NULL) Results
ON 0 = Results.Bonus
WHERE Status='URGENT';
I figured this out thanks to ideas generated by multiple answers, although it's not actually the direct result of any one. Why this does what I need has been explained in the edit of the original post, but I wanted to be able to resolve the question with the proper answer in case anyone else wants to perform this silly kind of operation. Thanks to all who helped.
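For comparison, here is a sketch of a slightly simpler variant of the same idea, run through Python's built-in sqlite3 rather than MySQL. Since an aggregate with no GROUP BY already returns exactly one row, the counting subquery can be attached with a CROSS JOIN, without GROUP BY NULL or the fake Bonus column. The sample rows are invented, and the subquery here filters on both Status and TimeLogged to match the original two-query version; that filter is an assumption about the intended count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tabl (
        UID INTEGER PRIMARY KEY, Name TEXT, Address TEXT,
        Status TEXT, TimeLogged TEXT
    );
    -- Invented data: three URGENT rows, two of them logged at 'Noon'.
    INSERT INTO Tabl VALUES
        (1, 'Ann', '1 A St', 'URGENT', 'Noon'),
        (2, 'Bob', '2 B St', 'URGENT', 'Dawn'),
        (3, 'Cy',  '3 C St', 'LOW',    'Noon'),
        (4, 'Di',  '4 D St', 'URGENT', 'Noon');
""")

# The subquery yields one row holding the count; CROSS JOIN repeats that
# single value on every row returned by the main SELECT.
rows = conn.execute("""
    SELECT t.Name, t.Address, r.Totals
    FROM Tabl AS t
    CROSS JOIN (
        SELECT COUNT(*) AS Totals
        FROM Tabl
        WHERE Status = 'URGENT' AND TimeLogged = 'Noon'
    ) AS r
    WHERE t.Status = 'URGENT'
    ORDER BY t.UID
""").fetchall()

for row in rows:
    print(row)
```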
You could probably do a union instead. You'd have to add a column to the original query and select 0 in it, then UNION that with your second query, which returns a single column. To do that, the second query must also select empty fields to match the first.
SELECT 0 AS Cnt, Name, Address FROM Databas.Tabl WHERE Status='URGENT'
UNION ALL
SELECT COUNT(*) AS Cnt, '' AS Name, '' AS Address FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';
It's a bit of a hack, but what you're trying to do isn't ideal...
Does this do what you need?
SELECT Tabl.Name ,
Tabl.Address ,
COUNT(Results.UID) AS GrandTotal,
COUNT(CASE WHEN Results.TimeLogged='Noon' THEN 1 END) AS NoonTotal
FROM Databas.Tabl
LEFT JOIN Databas.Tabl Results
ON Tabl.UID = Results.UID
WHERE Status ='URGENT'
GROUP BY Tabl.Name,
Tabl.Address
WITH ROLLUP;
The API you're using to access the database should be able to report to you how many rows were returned - say, if you're running perl, you could do something like this:
my $sth = $dbh->prepare("SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT'");
my $rv = $sth->execute();
my $rows = $sth->rows;
I don't believe grouping by Tabl.id would mess up the results. Give it a try and see if that's what you want.