MySQL DISTINCT before GROUP - mysql

I have a MySQL table with 10 fields. For this particular query, I only care about 4: Title, Variables, Location, Date. I would like to take the distinct values of these four groups and then group by Title, Variables. However, When I use the following query
Select DISTINCT
Title,
Variables,
Location,
Date
FROM ForecastsTest2 WHERE ...
GROUP BY Variables, Title
ORDER BY Title
It groups first and then takes distinct results. Is there any way I can switch this order?

I ended up finding my solution in
SELECT Variables,
Title
FROM (SELECT DISTINCT Variables,
Title,
Location,
Date
FROM MyTABLE as Table1
) as Table2
WHERE ...
GROUP BY Variables, Title
ORDER BY TITLE
I guess I didn't do a good job of mentioning this, but I also added a HAVING COUNT(*) >= 2 in the query. In this case, the the count will happen after all non distinct rows have been removed.

Related

MySQL 5.7 How to do GROUP BY with sorting?

Similar to this issue: MySQL 5.7 group by latest record
I'm not sure how to do this properly in 5.7. Also with possibility of 2nd sort column. Working query in 5.6 that I'm trying to replicate in 5.7:
SELECT id FROM test
GROUP BY category
ORDER BY sort1 DESC, sort2 DESC
id is not always the highest, so MAX(id) does not work.
Looking into the link above, the solution for single sort should be:
SELECT t1.*
FROM test t1
INNER JOIN (
SELECT category, max(sort) AS sort FROM test GROUP BY category
) t2 ON t2.category = t1.category AND t2.sort = t1.sort
But how will it work with 2 sorting?
You are using GROUP BY the wrong way.
Think of group by as a way to separate data row into different groups. Each group has multiple rows, based on the value of group by column.
Once you get those groups, selecting table columns (as in: select *) is like picking any row from that group randomly. This is not helpful nor useful.
Usually once we group records (or rows), we need to find meta information about those records. For example: get us the count of records in that group (as in: select count(*)), or the sum of values of a specific column in that group (as in: select sum(price)), or get the min, max or avg values.
So in a nutshell, when you use group by you should use on of the aggregation functions with it, otherwise it's not going to do you any good.
Why don't you have the ORDER BY at your outer query, instead?
SELECT *
FROM (
SELECT 100 AS id, 1 AS category, NULL AS sort
UNION
SELECT 200 AS id, 1 AS category, 2 AS sort
) dt
GROUP BY category
ORDER BY sort DESC;
It seems that what happened to the data when it was grouped, it took the first data while neglecting the ORDER BY DESC. On your first query, it ordered descending first then group by took the first record which is 200. And yes, this shouldn't be the way you should use GROUP BY. It is used in conjunction with aggregate functions.
when you select a column in a group by query that is not one of the columns you are grouping by, (ie, your id) you have no control over the value unless you use another aggregate function. If you want to sort, use MIN or MAX:
SELECT MAX(id), category, FROM `test2`
GROUP BY category; -- always returns 200
SELECT MIN(id), category, FROM `test2`
GROUP BY category; -- always returns 100

SQL find distinct and show other columns

I have read many replies and to similar questions but cannot seem to apply it to my situation. I have a table that averages 10,000 records and is ever changing. It containing a column called deviceID which has about 20 unique values, another called dateAndTime and many others including status1 and status2. I need to isolate one instance each deviceID, showing the record that had the most current dateAndTime. This works great using:
select DISTINCT deviceID, MAX(dateAndTime)
from MyTable
Group By deviceID
ORDER BY MAX(dateAndTime) DESC
(I have noticed omitting DISTINCT from the above statement also yields the same result)
However, I cannot expand this statement to include the fields status fields without incurring errors in the statement or incorrect results. I have tried using IN and EXISTS and syntax to isolate rows, all without luck. I am wondering how I can nest or re-write this query so that the results will display the unique deviceID's, the date of the most recent record and the corresponding status fields associated with those unique records.
If you can guarantee that the DeviceID + DateAndTime is UNIQUE you can do the following:
SELECT *
FROM
MyTable as T1,
(SELECT DeviceID, max(DateAndTime) as mx FROM MyTable group by DeviceID) as T2
WHERE
T1.DeviceID = T2.DeviceID AND
T1.DateAndTime = T2.mx
So basically what happens is, that you do a group by on the DeviceID (NOTE: A GROUP BY always goes with an aggregate function. We are using MAX in this case).
Then you join the Query with the Table, and add the DeviceID + DateAndTime in the WHERE clause.
Side Note... GROUP BY will return distinct elements with or without adding DISTINCT because all rows are distinct by default.
Maybe:
SELECT a.*
FROM( SELECT DISTINCT *,
ROW_NUMBER() OVER (PARTITION BY deviceID ORDER BY dateAndTime DESC) as rown
FROM MyTable ) a
WHERE a.rown = 1

Shortest way to GROUP BY similar values and retain latest rows?

Sometimes I want to get just one row of each similar value, I ussually do somethingl ike this:
SELECT * GROUP BY Text ORDER BY Date DESC
My problem using GROUP to select similar rows is that I don't get the values from the latest rows in the row (I'm not quite sure what's the criteria to choosing the row that stays). I want to retain only the newest row in the group.
I know how to do it with a self join but when statements are already very long it seems a bit complicated. Is there any shorter method? Maybe using DISTINCT instead of GROUP BY?
Assuming you have a table that has multiple columns and two of which are GroupID and DATE. If you want to select the latest record for each GroupID, you need to have a subquery which gets the latest Date for each GroupID, example
SELECT a.* -- this just selects all records from original table
FROM tableName a
INNER JOIN
(
-- this subquery gets the latest DATE entry for each GROUPID
SELECT GroupID, MAX(DATE) maxDate
FROM tableName
GROUP BY GroupID
) b ON a.GroupID = b.GroupID AND
a.Date = b.maxDate
if this answer is not clear, please do ask :D
Did you try to use the max function:
SELECT A,B,max(Date) GROUP BY Text

Mysql query to find distinct rows shows incorrect result

I wish to find the total number of distinct records in a table.
I have a table with the following columns
id, name, product, rating, manufacturer price
This has around 128 rows with some duplicates based on different column names.
I only want to select distinct rows:
select distinct name, product, rating, maufacturer, price from table
This returns 47 rows
For pagination purposes, I need to find the total number of distinct records, so I have another satatement:
select distinct count(name), product, rating, maufacturer, price from table
But this returns 128 instead of 47.
How can I get the total number of distinct rows? Any help will be much appreciated. Thanks
You have the distinct and count reversed.
SELECT COUNT(DISTINCT column_name) FROM table_name
Also, I would drop the extra fields when counting, your results will be unexpected for those other fields.
It is not quite clear if you want to get the count in the SAME query with the results or if you want to run a different query. Here go both solutions. In the result as a new column:
select distinct name, product, rating, manufacturer, price, (
select count(*) from (
select distinct name, product, rating, manufacturer, price from table1
) as resultCount) as resultCount
from table1
Notice the previous solution will repeat the count(*) for each row, which is not very efficient, not even visually appealing. Try running two queries one getting the actual data and the other one to get the amount of records in the table that match that data:
select distinct name, product, rating, manufacturer, price from table1
select count(*) from (
select distinct name, product, rating, manufacturer, price from table1
) as result
Hope this helps
Try adding GROUP BY name, product, rating, maufacturer, price clause
It would require running your actual query TWICE... an INNER for distinct and then get the count of those as a single row returned, and then join that to the original select distinct...
select distinct
t1.product,
t1.rating,
t1.maufacturer,
t1.price,
JustTheCount.DistCnt
from
table t1,
( select count(*) as DistCnt
from ( select distinct
t2.product,
t2.rating,
t2.maufacturer,
t2.price
from
table t2 )
) JustTheCount
In the following query, you're getting rows with distinct names since the DISTINCT clause precedes the name column:
SELECT DISTINCT name, product, rating, maufacturer, price FROM table
However, to get the count of the same records, use the following format:
SELECT COUNT(DISTINCT name) FROM table
Notice that DISTINCT goes inside of the COUNT function so that you're counting the distinct names. You probably don't want to include the other columns in the count query because they will be a random sample from the set. Of course, if you want a random sample, then include them.
Most applications will run the count query first, followed by the query to return the results. Also keep in mind that COUNT(*) is only an estimate, and the value may differ from the actual number of records returned.
SELECT DISTINCT COUNT(name), product FROM table isn't even a valid query in MySQL 4.x. You can't mix aggregate and non-aggregate columns. IN 5.x, it'll run, but the values for the non aggregate columns will be a random sample from the set.
At the risk of sparking some flames here.. you could always use:
SQL_CALC_FOUND_ROWS as the first part of your SQL. This is very mysql specific though.
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();

Mysql select distinct

I am trying to select of the duplicate rows in mysql table it's working fine for me but the problem is that it is not letting me select all the fields in that query , just letting me select the field name i used as distinct , lemme write the query for better understading
mysql_query("SELECT DISTINCT ticket_id FROM temp_tickets ORDER BY ticket_id")
mysql_query("SELECT * , DISTINCT ticket_id FROM temp_tickets ORDER BY ticket_id")
1st one is working fine
now when i am trying to select all fields i am ending up with errors
i am trying to select the latest of the duplicates let say ticket_id 127 is 3 times on row id 7,8,9 so i want to select it once with the latest entry that would be 9 in this case and this applies on all the rest of the ticket_id's
Any idea
thanks
DISTINCT is not a function that applies only to some columns. It's a query modifier that applies to all columns in the select-list.
That is, DISTINCT reduces rows only if all columns are identical to the columns of another row.
DISTINCT must follow immediately after SELECT (along with other query modifiers, like SQL_CALC_FOUND_ROWS). Then following the query modifiers, you can list columns.
RIGHT: SELECT DISTINCT foo, ticket_id FROM table...
Output a row for each distinct pairing of values across ticket_id and foo.
WRONG: SELECT foo, DISTINCT ticket_id FROM table...
If there are three distinct values of ticket_id, would this return only three rows? What if there are six distinct values of foo? Which three values of the six possible values of foo should be output?
It's ambiguous as written.
Are you looking for "SELECT * FROM temp_tickets GROUP BY ticket_id ORDER BY ticket_id ?
UPDATE
SELECT t.*
FROM
(SELECT ticket_id, MAX(id) as id FROM temp_tickets GROUP BY ticket_id) a
INNER JOIN temp_tickets t ON (t.id = a.id)
You can use group by instead of distinct. Because when you use distinct, you'll get struggle to select all values from table. Unlike when you use group by, you can get distinct values and also all fields in table.
You can use DISTINCT like that
mysql_query("SELECT DISTINCT(ticket_id), column1, column2, column3
FROM temp_tickets
ORDER BY ticket_id");
use a subselect:
http://forums.asp.net/t/1470093.aspx