Below is the data in my table:
TABLE:
abc-ac
abc-dc
aax-i
bcs-o-dc
ddd-o-poe-dc
I need to write a query which will display only the unique entries as a result:
abc-ac
aax-i
bcs-o-dc
ddd-o-poe-dc
So basically, since the first two entries both start with "abc", they should be treated as one entry and displayed only once.
Thanks.
If you're not picky about which of the two abc-* records it shows, you can use this:
SELECT f1 FROM mytable GROUP BY substring_index(f1, '-', 1)
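That substring_index() function splits the value in your field on - and returns the first part, so your records get grouped by that first part only. This is one of the few times we can take advantage of MySQL's strange GROUP BY behavior, which allows you to select non-aggregated fields that are not in the GROUP BY. Note that this only works when the ONLY_FULL_GROUP_BY SQL mode is disabled; it is enabled by default from MySQL 5.7.5 onward, in which case you would need to wrap f1 in an aggregate such as MIN(f1).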
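For anyone who wants to try the idea without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It is not the exact query from the answer: SQLite has no substring_index(), so substr() plus instr() stand in for it (assuming every value contains a hyphen), and MIN(f1) is used so that the row picked from each group is well defined. The names mytable and f1 come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (f1 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("abc-ac",), ("abc-dc",), ("aax-i",), ("bcs-o-dc",), ("ddd-o-poe-dc",)],
)

# substr(f1, 1, instr(f1, '-') - 1) is everything before the first '-',
# i.e. a SQLite stand-in for MySQL's substring_index(f1, '-', 1).
# MIN(f1) picks one deterministic value per prefix group.
rows = conn.execute(
    """
    SELECT MIN(f1) AS entry
    FROM mytable
    GROUP BY substr(f1, 1, instr(f1, '-') - 1)
    ORDER BY entry
    """
).fetchall()

unique_entries = [row[0] for row in rows]
print(unique_entries)
```

Using an aggregate instead of a bare column also keeps the query valid on databases that reject non-aggregated columns outside the GROUP BY.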
Related
I have duplicate records in one SQL table. The rows have the same id but different values in other fields. How can I combine or merge two or more of those rows into one row? Please help.
You can group those rows on certain fields by specifying a GROUP BY clause.
In your case you would group on the ID column. For the columns you select in the SELECT clause that are not specified in the GROUP BY clause (i.e. columns other than ID), you will have to apply an aggregate function (e.g. SUM, MAX, MIN, ...).
Edit - Simplified example based on your image:
SELECT
MasterID,
CUSTNAME=MIN(CUSTNAME),
ER1=MIN(ER1),
ER1_BU=MIN(ER1_BU)
-- For the other fields, the idea is the same
FROM
your_table
GROUP BY
MasterID
ORDER BY
MasterID;
This example takes, for each field, the minimum of that field's values for a particular MasterID. You did not really define what you mean by "merge". If you want a different kind of merge, you will have to clarify further in case this example doesn't "merge" the rows the way you want.
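To see what that kind of "merge" does in practice, here is a small sketch run through Python's sqlite3 module. The sample rows are made up (the asker's image is not available), and the alias is written with AS because the CUSTNAME=MIN(CUSTNAME) form is SQL Server syntax. It relies on MIN() skipping NULLs, so a value present in only one of the duplicate rows survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (MasterID INTEGER, CUSTNAME TEXT, ER1 TEXT, ER1_BU TEXT)"
)
# Hypothetical data: MasterID 1 appears twice, each row filling a different field.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "Acme", "x", None),
        (1, "Acme", None, "y"),
        (2, "Bolt", "z", None),
    ],
)

merged = conn.execute(
    """
    SELECT MasterID,
           MIN(CUSTNAME) AS CUSTNAME,
           MIN(ER1)      AS ER1,
           MIN(ER1_BU)   AS ER1_BU
    FROM your_table
    GROUP BY MasterID
    ORDER BY MasterID
    """
).fetchall()
print(merged)
```

If two rows carry different non-NULL values for the same field, MIN() simply keeps the smaller one, which may or may not be the merge you are after.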
I have a table :
CREATE TABLE data
(
value integer,
name varchar(100)
)
In my table the same name can appear several times with different values of value. Now I want to get each DISTINCT name and the avg() of its values from the table data.
I am able to get the DISTINCT names but unable to get the avg() of their values.
Now with following Query I get avg() of all data :
select floor(avg(value)) from data
I know this is incorrect, but I am new to SQL. I want floor(avg(value)) computed separately for each distinct value of name.
Data :
insert into data values(10, 'mnciitbhu');
insert into data values(20, 'mnciitbhu');
insert into data values(40, 'mafiya69');
insert into data values(20, 'mafiya69');
insert into data values(0, 'mafiya69');
Output :
mnciitbhu 15
mafiya69 20
Adding this because the other answers, while accurate, are not detailed.
What you want to do here is use the grouping and aggregation features of SQL.
Grouping your results by particular fields divides your result set into discrete sections, which you can then operate on with aggregate functions to get averages, sums, counts, etc. per group.
For a full list of aggregate functions, and other miscellaneous information about group by, you can read 12.16.1 GROUP BY (Aggregate) Functions.
In your instance, since you want the average per name, you will need to group by name. This would give the following query:
select name, avg(value)
from `data`
group by name; -- this is the important line
And this query will calculate the average of value, for each group of names in your table, returning one row per group.
One very important consideration when using group by is that every field in the select list must either appear in the group by clause or be used inside an aggregate function. If you refer to a field that isn't covered by this, you may end up with undesired, indeterminate results.
From the manual, 12.16.3 MySQL Handling of GROUP BY:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
The importance of that paragraph cannot be overstated. It is very easy to misunderstand how this works and arrive at a query that seems to give the desired result but will occasionally return incorrect or undesired data.
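As a quick check of the group-by approach against the sample data from the question, here is a sketch using Python's sqlite3 module in place of MySQL. One assumption to flag: CAST(... AS INTEGER) is used instead of floor(), because floor() is not available in every SQLite build; the two agree for the non-negative averages here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (value INTEGER, name VARCHAR(100))")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        (10, "mnciitbhu"),
        (20, "mnciitbhu"),
        (40, "mafiya69"),
        (20, "mafiya69"),
        (0, "mafiya69"),
    ],
)

# One output row per name; AVG() is evaluated inside each group.
averages = conn.execute(
    """
    SELECT name, CAST(AVG(value) AS INTEGER) AS average
    FROM data
    GROUP BY name
    ORDER BY name DESC
    """
).fetchall()
print(averages)
```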
Use this code:
select name,AVG(value) as Average from data
group by name
order by name desc
OUTPUT:
name Average
mnciitbhu 15
mafiya69 20
Try this
select name,avg(value) from data group by name
I have a table with "unique" values. The problem is that the program which adds these values also appends 3 different suffixes to each value (the last 2 characters of the value). As a result, I have three variants of each value, one per suffix. So I need to get only the unique values from the database, somehow ignoring the last two characters. Any ideas?
Which camera_id should be returned (first, last, maximum, minimum?) if rows share one "unique" value but have different camera_ids? Try something like this:
select
LEFT(camera_name,LENGTH(camera_name)-2), max(camera_id)
from cameras
where site_id=1
group by LEFT(camera_name,LENGTH(camera_name)-2)
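A runnable sketch of the same idea, using Python's sqlite3 module. The camera names and ids below are invented for illustration, and substr(camera_name, 1, length(camera_name) - 2) is the SQLite equivalent of the LEFT(...) expression, since SQLite has no LEFT() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cameras (camera_id INTEGER, camera_name TEXT, site_id INTEGER)"
)
# Hypothetical rows: the last two characters are the program-added suffix.
conn.executemany(
    "INSERT INTO cameras VALUES (?, ?, ?)",
    [
        (1, "gate01", 1),
        (2, "gate02", 1),
        (3, "gate03", 1),
        (4, "lobby01", 1),
        (5, "dock01", 2),
    ],
)

# Strip the two-character suffix, group on what is left,
# and keep the highest camera_id within each group.
cameras = conn.execute(
    """
    SELECT substr(camera_name, 1, length(camera_name) - 2) AS base_name,
           MAX(camera_id) AS camera_id
    FROM cameras
    WHERE site_id = 1
    GROUP BY substr(camera_name, 1, length(camera_name) - 2)
    ORDER BY base_name
    """
).fetchall()
print(cameras)
```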
Do you want to retrieve the values with the first letter only?
SELECT DISTINCT SUBSTRING(ColumnName, 1,1) a
FROM tablename
ORDER BY a
Can you show sample records? It helps a lot when you're asking a question.
I have a table named forms with the following structure-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
SomeGroup | SomeForm2 | SomePath2
------------------------------------
I use the following query-
SELECT * FROM forms GROUP BY 'GROUP'
It returns only the first row-
GROUP | FORM | FILEPATH
====================================
SomeGroup | SomeForm1 | SomePath1
------------------------------------
Shouldn't it return both (or all of it)? Or am I (possibly) wrong?
As the manual states:
In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
In your case, MySQL is correctly performing the grouping operation but, since you select all columns (including those you are not grouping by), it gives you one indeterminate record from each group.
It only returns one row, because the values of your GROUP column are the same ... that's basically how GROUP BY works.
Btw, when using GROUP BY it's good form to use aggregate functions for the other columns, such as COUNT(), MIN(), MAX(). In MySQL it usually returns the first row of each group if you just specify the column names; other databases will not like that though.
Your code:
SELECT * FROM forms GROUP BY 'GROUP'
isn't very "good" SQL. MySQL lets you get away with it and returns only one value for each column not mentioned in the group by clause. Almost any other database would refuse to run this query. As a rule, any column that is not part of the grouping condition must be used with an aggregate function.
As far as MySQL is concerned, I just solved my problem by trial and error.
I had the same problem 10 minutes ago. I was using a MySQL statement something like this:
SELECT * FROM forms GROUP BY 'ID'; -- returns only one row
However, a statement like the following yields the same result:
SELECT ID FROM forms GROUP BY 'ID'; -- returns only one row
The following was my solution:
SELECT ID FROM forms GROUP BY ID; -- returns more than one row (with one column, "ID"), grouped by ID
or
SELECT * FROM forms GROUP BY ID; -- returns more than one row (with all columns), grouped by ID
or
SELECT * FROM forms GROUP BY `ID`; -- returns more than one row (with all columns), grouped by ID
Lesson: do not put single quotes around the column name. I believe MySQL then treats it as a string literal rather than a column, so every row lands in the same group. Remove the quotes from the column name and it will group by its value. You can, however, use backtick escapes, e.g. `ID`.
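The difference between grouping by a quoted string and grouping by the column itself can be reproduced outside MySQL too. Below is a small sketch using Python's sqlite3 module with made-up rows; SQLite also reads 'ID' in a GROUP BY as a constant string, so the effect is the same. COUNT(*) is selected so that the size of each group is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forms (ID INTEGER, FORM TEXT)")
# Hypothetical rows: two forms share ID 1, one form has ID 2.
conn.executemany(
    "INSERT INTO forms VALUES (?, ?)",
    [(1, "SomeForm1"), (1, "SomeForm2"), (2, "SomeForm3")],
)

# 'ID' in single quotes is a string constant: every row gets the same
# grouping key, so the whole table collapses into one group.
grouped_by_literal = conn.execute(
    "SELECT COUNT(*) FROM forms GROUP BY 'ID'"
).fetchall()

# ID without quotes is the column: one group per distinct value.
grouped_by_column = conn.execute(
    "SELECT ID, COUNT(*) FROM forms GROUP BY ID ORDER BY ID"
).fetchall()

print(grouped_by_literal)
print(grouped_by_column)
```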
Thank you everyone for pointing out the obvious mistake I was too blind to see. I finally replaced GROUP BY with ORDER BY and included a WHERE clause to get my desired result. That is what I was intending to use all along. Silly me.
My final query becomes this-
SELECT * FROM forms WHERE `GROUP`='SomeGroup' ORDER BY `GROUP`
SELECT * FROM forms GROUP BY `GROUP`
it's strange that your query does work
The above result is kind of correct, but not quite.
All columns you select which are not part of the GROUP BY clause have to be aggregated by some function (see the list of aggregate functions in the MySQL documentation). Most often these functions are used with numeric columns.
Besides this, your query will return one output row for every distinct value (or combination of values) in the columns referenced in the GROUP BY clause. In your case there is just one distinct value in the GROUP column, namely "SomeGroup", so the output will only contain one row for this value.
A GROUP BY clause should only be required if you apply group functions, say max, min, avg, sum, etc., in the query expressions. Your query does not show any such functions, meaning you do not actually need a GROUP BY clause. If you still use one, you will receive only one record from each group of results.
Hence the output of your query is as expected.
Query result is perfect; it will return only one row.
I want to get the distinct values of a particular column; however, duplicates are not properly eliminated if more than 3 columns are selected.
The query is:
SELECT DISTINCT
ShoppingSessionId, userid
FROM
dbo.tbl_ShoppingCart
GROUP BY
ShoppingSessionId, userid
HAVING
userid = 7
This query produces the correct result, but if we add another column then the result is wrong.
Please help: I want ShoppingSessionId to stay distinct even when I select all the columns from the table, together with the where clause.
How can I do that?
The DISTINCT keyword applies to the entire row, never to a column.
At present, DISTINCT is not needed at all, because your query already guarantees that ShoppingSessionId is distinct: it groups by that column and filters on the other grouping column (userid).
When you add a third column to GROUP BY and it results in duplicated ShoppingSessionId values, it means that some ShoppingSessionId values are associated with several different values of the added column.
If you want ShoppingSessionId to remain distinct after including that third column, you should decide which values of the added column should be kept in the output and which should be discarded. This is called aggregating. You could apply the MAX() function to that column, or MIN(), or any other suitable aggregate function. Note that the column should not be included in GROUP BY in this case.
Here's an illustration of what I'm talking about:
SELECT
ShoppingSessionId,
userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
GROUP BY
ShoppingSessionId,
userid
HAVING userid = 7
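To make the row-level behaviour of DISTINCT concrete, here is a sketch run through Python's sqlite3 module. The table name drops the dbo. schema prefix (SQLite has no such schema), and ProductId and the sample rows are hypothetical stand-ins for "YourThirdColumn".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_ShoppingCart (ShoppingSessionId TEXT, userid INTEGER, ProductId INTEGER)"
)
# Hypothetical rows: session s1 holds two different products.
conn.executemany(
    "INSERT INTO tbl_ShoppingCart VALUES (?, ?, ?)",
    [("s1", 7, 100), ("s1", 7, 200), ("s2", 7, 300), ("s3", 8, 400)],
)

# DISTINCT compares whole rows, so s1 shows up once per product.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT ShoppingSessionId, userid, ProductId
    FROM tbl_ShoppingCart
    WHERE userid = 7
    ORDER BY ShoppingSessionId, ProductId
    """
).fetchall()

# Aggregating the third column leaves one row per session.
aggregated_rows = conn.execute(
    """
    SELECT ShoppingSessionId, userid, MAX(ProductId) AS ProductId
    FROM tbl_ShoppingCart
    GROUP BY ShoppingSessionId, userid
    HAVING userid = 7
    ORDER BY ShoppingSessionId
    """
).fetchall()

print(distinct_rows)
print(aggregated_rows)
```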
There's one more note on your query. The HAVING clause is typically used for filtering on aggregated columns. If your filter does not involve aggregated columns, you'll be better off using the WHERE clause instead:
SELECT
ShoppingSessionId,
userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
WHERE userid = 7
GROUP BY
ShoppingSessionId,
userid
Although both queries would produce identical results, their efficiency would be different, because the first query would have to pull all rows, group/aggregate them, then discard all rows except userid = 7, but the second one would discard rows first and only then group/aggregate the remaining, which is much more efficient.
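A small sketch, again via Python's sqlite3 module, confirming that the HAVING and WHERE versions return the same rows. It only checks the results, not the efficiency claim, since that depends on the database's query planner. ProductId and the sample data are hypothetical, and the dbo. prefix is dropped because SQLite has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_ShoppingCart (ShoppingSessionId TEXT, userid INTEGER, ProductId INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_ShoppingCart VALUES (?, ?, ?)",
    [("s1", 7, 100), ("s1", 7, 200), ("s2", 7, 300), ("s3", 8, 400)],
)

# Filter applied after grouping.
having_rows = conn.execute(
    """
    SELECT ShoppingSessionId, userid, MAX(ProductId) AS ProductId
    FROM tbl_ShoppingCart
    GROUP BY ShoppingSessionId, userid
    HAVING userid = 7
    ORDER BY ShoppingSessionId
    """
).fetchall()

# Filter applied before grouping.
where_rows = conn.execute(
    """
    SELECT ShoppingSessionId, userid, MAX(ProductId) AS ProductId
    FROM tbl_ShoppingCart
    WHERE userid = 7
    GROUP BY ShoppingSessionId, userid
    ORDER BY ShoppingSessionId
    """
).fetchall()

print(having_rows == where_rows)
```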
You could go even further and exclude the userid column from GROUP BY and pull its value with an aggregate function:
SELECT
ShoppingSessionId,
MAX(userid) AS userid,
MAX(YourThirdColumn) AS YourThirdColumn
FROM dbo.tbl_ShoppingCart
WHERE userid = 7
GROUP BY
ShoppingSessionId
Since all userid values in your output are supposed to be 7 (because that's in your filter), you can just pick the maximum value for every ShoppingSessionId, knowing that it'll always be 7.