I have a table with a Name column and a Label column. Some names have several labels (each label is a separate row), so I need a query that returns the names that have more than one label, together with their labels.
Could you help me, please?
O.
Table:
+----+--------+-------+
| id | Name | Label |
+----+--------+-------+
| 1 | Juan | 10 |
| 2 | Joli | 11 |
| 3 | Sali | 12 |
| 4 | Juan | 15 |
| 5 | Odette | 13 |
| 6 | Sali | 18 |
| 7 | Sali | 17 |
| 8 | Youri | 14 |
+----+--------+-------+
Expected result:
+--------+-------+
| Name | Label |
+--------+-------+
| Juan | 10 |
| Juan | 15 |
| Sali | 12 |
| Sali | 18 |
| Sali | 17 |
+--------+-------+
Try this query.
SELECT name, label
FROM table2
WHERE name IN (SELECT name
               FROM table2
               GROUP BY name
               HAVING COUNT(name) > 1)
ORDER BY name ASC
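If you want to check the query against the sample data first, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for your database (that is an assumption, your engine may differ). The rows are copied from the question. The extra `id ASC` tiebreaker is my addition so the order within each name is deterministic.

```python
import sqlite3

# In-memory database; SQLite stands in for the real engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT, label INTEGER)")
conn.executemany(
    "INSERT INTO table2 (id, name, label) VALUES (?, ?, ?)",
    [(1, "Juan", 10), (2, "Joli", 11), (3, "Sali", 12), (4, "Juan", 15),
     (5, "Odette", 13), (6, "Sali", 18), (7, "Sali", 17), (8, "Youri", 14)],
)

# The subquery finds names that occur more than once;
# the outer query returns every row for those names.
rows = conn.execute(
    """
    SELECT name, label
    FROM table2
    WHERE name IN (SELECT name FROM table2 GROUP BY name HAVING COUNT(name) > 1)
    ORDER BY name ASC, id ASC
    """
).fetchall()

for name, label in rows:
    print(name, label)
```

With the sample rows this prints the two Juan rows followed by the three Sali rows, which matches the expected result in the question.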
Let's say I have a user table like this:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+----------------------+------+
The actual database contains more data, and some of it is duplicated, with more than two records per name. The point is that I want to get only the first and the second of the duplicated rows with the name "Joseph". How can I achieve this? My code so far:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
With that code, the result is:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+----------------------+------+
And I use this code to try to get the second duplicated row:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 2')->get();
The result is still the same as above:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+----------------------+------+
I want the result to contain the record with id "5" and name "Joseph", like this:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+----------------------+------+
But it seems only the first duplicate row is retrieved and I can't get the second one. Can anybody give me a suggestion?
Let's start from your query:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
This returns every group of rows whose count is 1 or more, which means every name, so it behaves like a DISTINCT on name.
If you want only the duplicated records, you should get the groups whose count is LARGER than 1.
The other thing to notice is that a non-aggregated column will be chosen arbitrarily. Suppose you select name, count(name), email (email is neither in the group by clause nor aggregated) and 4 rows have the same name. You'll see:
+--------+-------------+-------+
| Name | Count(Name) | Email |
+--------+-------------+-------+
| Joseph | 4 | X |
+--------+-------------+-------+
What do you expect instead of X? Which one of the 4 emails? In SQL Server it is forbidden to select a non-aggregated column that is not in the group by clause, and some other databases (for example MySQL with ONLY_FULL_GROUP_BY disabled) will just give you an arbitrary one of the 4 counted rows.
See this answer for more details, it explains this very well: Do all columns in a SELECT list have to appear in a GROUP BY clause
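One way around the "which email?" ambiguity is to ask for an explicit aggregate instead of a bare column. Here is a rough sketch, using Python's sqlite3 as a stand-in for your database (an assumption) and a few rows copied from the question. The aliases `name_count` and `first_id` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (3, "Joseph", "joseph410#mail.com"),
    (5, "Joseph", "jh.city89#mail.com"),
    (6, "Kim", "kimsd#mail.com"),
    (8, "Joseph", "psa.jos#mail.com"),
    (9, "Joseph", "joseph.la#mail.com"),
])

# Every selected column is either in GROUP BY (name) or an aggregate
# (COUNT, MIN), so the result does not depend on which row the engine picks.
grouped = conn.execute(
    """
    SELECT name, COUNT(*) AS name_count, MIN(id) AS first_id
    FROM users
    GROUP BY name
    HAVING COUNT(*) > 1
    """
).fetchall()

print(grouped)  # [('Joseph', 4, 3)]
```

With `first_id` in hand you can join back to the table if you need the email of that specific row.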
So, we'll use having count(name) > 1 and select only the grouped column name together with its count:
DB::table('users')->select('name', DB::raw('count(name)'))->groupBy('name')->havingRaw('count(name) > 1')->get();
This should give you (I didn't test it) this:
+--------+-------------+
| name | Count(name) |
+--------+-------------+
| Joseph | 4 |
+--------+-------------+
This gives you all names that have 2 or more rows. You can set the required number of duplicates in the having clause; for example, having count(name) = 3 gives you all names that appear in exactly 3 rows.
So how do you get the second duplicate? I have a question for you first:
What is the first (original) duplicate? Is it the one with the oldest created_at, the oldest updated_at, or some other condition? Because of that, you should make another query with an order by clause that returns the duplicates in the order most convenient to you. For example:
select * from `users` where `name` in (select `name` from users group by `name` having count(`name`) > 1) order by `id` asc
which will give:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 3 | Joseph | joseph410#mail.com | 21 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
+----+-----------+----------------------+------+
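Once you have decided on the ordering, picking "the n-th row of each name" can be done by counting how many rows with the same name come at or before the current one. The sketch below assumes that "first" means lowest id, and uses Python's sqlite3 as a stand-in for MySQL. The helper `nth_row_per_name` is a made-up name, not a Laravel or MySQL function. On MySQL 8+ a ROW_NUMBER() window function would be another option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, age INTEGER)"
)
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "John", "john.doe1#mail.com", 24),
    (2, "Josh", "josh99#mail.com", 29),
    (3, "Joseph", "joseph410#mail.com", 21),
    (4, "George", "gge.48#mail.com", 28),
    (5, "Joseph", "jh.city89#mail.com", 24),
    (6, "Kim", "kimsd#mail.com", 32),
    (7, "Bob", "bob.s#mail.com", 38),
    (8, "Joseph", "psa.jos#mail.com", 34),
    (9, "Joseph", "joseph.la#mail.com", 28),
    (10, "Jonathan", "jonhan#mail.com", 22),
])


def nth_row_per_name(n):
    """Return, for every name, its n-th row when rows are ordered by id."""
    sql = """
        SELECT u.id, u.name, u.email, u.age
        FROM users AS u
        WHERE (SELECT COUNT(*)              -- position of this row
               FROM users AS x              -- within its name group
               WHERE x.name = u.name AND x.id <= u.id) = :n
        ORDER BY u.id
    """
    return conn.execute(sql, {"n": n}).fetchall()


first = nth_row_per_name(1)   # one row per name: ids 1, 2, 3, 4, 6, 7, 10
second = nth_row_per_name(2)  # only names with at least two rows

print(second)  # [(5, 'Joseph', 'jh.city89#mail.com', 24)]
```

Names that occur only once have no second row, so they drop out for n = 2. If you also want those, you would have to combine the two results yourself.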
I have a query that returns a table that looks something like this:
+------+----------+-------+------+----+
| pID | name | month | q | s |
+------+----------+-------+------+----+
| 1468 | bob | 2 | 1 | 14 |
| 1469 | bob | 2 | 1 | 2 |
| 1470 | bob | 2 | 1 | 9 |
| 1468 | bob | 3 | 1 | 7 |
| 1469 | bob | 3 | 1 | 8 |
| 1470 | bob | 3 | 1 | 11 |
+------+----------+-------+------+----+
and I would like the output to be
+----------+-------+------+-----+
| name | month | q | sub |
+----------+-------+------+-----+
| bob | 2 | 1 | 25 |
| bob | 3 | 1 | 26 |
+----------+-------+------+-----+
Essentially, I want the first three columns of my output to be name, month and q, grouped by name and month (q always has the same value within each group), and I want the last column to be the SUM of s within each group.
Thanks.
It should be something like this:
SELECT name, month, q, SUM(s) AS sub
FROM your_table -- placeholder name: use your table, or wrap your query as a subquery
GROUP BY name, month, q
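To sanity-check the grouping against the sample rows, here is a small sketch using Python's sqlite3 as a stand-in (an assumption). The table name `results` is hypothetical, since the question only shows the output of an unnamed query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "results" is a made-up table name standing in for the asker's query output.
conn.execute("CREATE TABLE results (pID INTEGER, name TEXT, month INTEGER, q INTEGER, s INTEGER)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", [
    (1468, "bob", 2, 1, 14),
    (1469, "bob", 2, 1, 2),
    (1470, "bob", 2, 1, 9),
    (1468, "bob", 3, 1, 7),
    (1469, "bob", 3, 1, 8),
    (1470, "bob", 3, 1, 11),
])

# Sum the source column s and expose it under the alias sub.
totals = conn.execute(
    """
    SELECT name, month, q, SUM(s) AS sub
    FROM results
    GROUP BY name, month, q
    ORDER BY name, month
    """
).fetchall()

print(totals)  # [('bob', 2, 1, 25), ('bob', 3, 1, 26)]
```

The sums 14 + 2 + 9 = 25 and 7 + 8 + 11 = 26 match the desired output.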
Here is my table:
+----+-------+------+
| id | name | code |
+----+-------+------+
| 1 | jack | 1 |
| 2 | peter | 1 |
| 3 | jack | 1 |
| 4 | ali | 2 |
| 5 | peter | 3 |
| 6 | peter | 1 |
| 7 | ali | 2 |
| 8 | jack | 3 |
| 9 | peter | 2 |
| 10 | peter | 4 |
+----+-------+------+
I want to select all rows whose code is between 1 and 3 and whose name appears in at least 4 such rows (rows with that name and a code between 1 and 3).
From the above data, I want this output:
+----+-------+------+
| id | name | code |
+----+-------+------+
| 2 | peter | 1 |
| 5 | peter | 3 |
| 6 | peter | 1 |
| 9 | peter | 2 |
+----+-------+------+
How can I do that?
Use a subquery to figure out which names should be returned, then build your main query on that.
Try this:
select *
from mytable
where name in (
    select name
    from mytable
    where code between 1 and 3
    group by name
    having count(*) > 3)
  and code between 1 and 3
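Here is a quick check of that query on the sample data, sketched with Python's sqlite3 as a stand-in for your database (an assumption). The `order by id` is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, code INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, "jack", 1), (2, "peter", 1), (3, "jack", 1), (4, "ali", 2),
    (5, "peter", 3), (6, "peter", 1), (7, "ali", 2), (8, "jack", 3),
    (9, "peter", 2), (10, "peter", 4),
])

# Inner query: names with at least 4 rows whose code is in 1..3.
# Outer query: the rows of those names, again limited to code 1..3
# (this is what drops peter's row with code 4).
matches = conn.execute(
    """
    select *
    from mytable
    where name in (
        select name
        from mytable
        where code between 1 and 3
        group by name
        having count(*) > 3)
      and code between 1 and 3
    order by id
    """
).fetchall()

print(matches)  # [(2, 'peter', 1), (5, 'peter', 3), (6, 'peter', 1), (9, 'peter', 2)]
```

jack has only three rows with a code in range and ali has two, so neither passes the `having` filter.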
I'm trying to get the following result in MySQL from the table below. I have tried to use SUM(CASE WHEN ...), but it gives me the result for an individual name only. I want the report to show the total cost for each user.
Note: the id column is not important, you can ignore it.
mysql> select * from calls_records;
+----+-------+------------+------+
| id | month | name | cost |
+----+-------+------------+------+
| 1 | 1 | osama | 40 |
| 2 | 1 | rahman | 40 |
| 3 | 1 | ahmed | 30 |
| 4 | 1 | ali albann | 10.5 |
| 5 | 2 | osama | 10 |
| 6 | 2 | ali albann | 30 |
| 7 | 2 | ahmed | 10 |
| 8 | 2 | rahman | 10 |
+----+-------+------------+------+
expected result
+------------+---------------------------+
| name | total_cost_for_each_user |
+------------+---------------------------+
| ahmed | 40 |
| ali albann | 40.5 |
| osama | 50 |
| rahman | 50 |
+------------+---------------------------+
query
select name, sum(cost) as totalcost
from calls_records
group by name
;
output
+------------+-----------+
| name | totalcost |
+------------+-----------+
| ahmed | 40 |
| ali albann | 40.5 |
| osama | 50 |
| rahman | 50 |
+------------+-----------+
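The same totals can be reproduced from the sample rows with a short sketch, using Python's sqlite3 as a stand-in for MySQL (an assumption). Summing by hand, ahmed has 30 + 10 = 40, which agrees with the output table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls_records (id INTEGER PRIMARY KEY, month INTEGER, name TEXT, cost REAL)")
conn.executemany("INSERT INTO calls_records VALUES (?, ?, ?, ?)", [
    (1, 1, "osama", 40), (2, 1, "rahman", 40), (3, 1, "ahmed", 30),
    (4, 1, "ali albann", 10.5), (5, 2, "osama", 10), (6, 2, "ali albann", 30),
    (7, 2, "ahmed", 10), (8, 2, "rahman", 10),
])

# One output row per name, with the cost summed over all months.
per_user = conn.execute(
    """
    select name, sum(cost) as totalcost
    from calls_records
    group by name
    order by name
    """
).fetchall()

for name, totalcost in per_user:
    print(name, totalcost)
```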
This seems like such a simple problem, but I can't find a good solution. I'm trying to select information from a slightly misformatted table. Basically, wherever sequence=0, the person_id should actually be a company_id. This company_id then applies to all the rows which have the same group_id.
Someone thought it was a good idea to format things this way instead of simply having a company_id column, but it makes trying to select by company very difficult. It would make my programming much easier to simply add this extra column, and fix the formatting.
I want to turn something like this:
+----------+------------+-----------+----------+
| group_id | date | person_id | sequence |
+----------+------------+-----------+----------+
| 1 | 2012-08-31 | 10 | 0 |
| 1 | 2012-08-31 | 11 | 1 |
| 1 | 2012-08-31 | 12 | 2 |
| 2 | 1999-04-16 | 10 | 0 |
| 2 | 1999-04-16 | 21 | 1 |
| 2 | 1999-04-16 | 22 | 2 |
| 2 | 1999-04-16 | 23 | 3 |
| 2 | 1999-04-16 | 24 | 4 |
| 3 | 2001-01-09 | 30 | 0 |
| 3 | 2001-01-09 | 31 | 1 |
| 3 | 2001-01-09 | 11 | 2 |
| 3 | 2001-01-09 | 12 | 3 |
+----------+------------+-----------+----------+
Into this:
+------------+----------+------------+-----------+----------+
| company_id | group_id | date | person_id | sequence |
+------------+----------+------------+-----------+----------+
| 10 | 1 | 2012-08-31 | 11 | 1 |
| 10 | 1 | 2012-08-31 | 12 | 2 |
| 10 | 2 | 1999-04-16 | 21 | 1 |
| 10 | 2 | 1999-04-16 | 22 | 2 |
| 10 | 2 | 1999-04-16 | 23 | 3 |
| 10 | 2 | 1999-04-16 | 24 | 4 |
| 30 | 3 | 2001-01-09 | 31 | 1 |
| 30 | 3 | 2001-01-09 | 11 | 2 |
| 30 | 3 | 2001-01-09 | 12 | 3 |
+------------+----------+------------+-----------+----------+
The only way I can think of how to achieve this is with nested SELECT statements, which are very inefficient considering I have about 100M rows. It's a one time fix though, so I don't mind letting it run overnight.
If you permanently want to change your table to include a company_id column then do this:
First alter the table and add the new column:
alter table your_table add company_id int;
Then update all rows to set company_id to the person_id of the sequence = 0 row in the same group:
UPDATE your_table a
JOIN your_table b ON a.group_id = b.group_id
SET a.company_id = b.person_id
WHERE b.sequence = 0;
And finally remove the rows with sequence = 0:
DELETE FROM your_table WHERE sequence = 0;
The end result will be:
| group_id | date | person_id | sequence | company_id |
|----------|------------|-----------|----------|------------|
| 1 | 2012-08-31 | 11 | 1 | 10 |
| 1 | 2012-08-31 | 12 | 2 | 10 |
| 2 | 1999-04-16 | 21 | 1 | 10 |
| 2 | 1999-04-16 | 22 | 2 | 10 |
| 2 | 1999-04-16 | 23 | 3 | 10 |
| 2 | 1999-04-16 | 24 | 4 | 10 |
| 3 | 2001-01-09 | 31 | 1 | 30 |
| 3 | 2001-01-09 | 11 | 2 | 30 |
| 3 | 2001-01-09 | 12 | 3 | 30 |
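If you want to rehearse the three steps on a small copy before running them on 100M rows, here is a sketch with Python's sqlite3. Note the assumption: SQLite does not accept MySQL's `UPDATE ... JOIN` syntax, so the update is rewritten as a correlated subquery. On MySQL you would keep the JOIN form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (group_id INTEGER, date TEXT, person_id INTEGER, sequence INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", [
    (1, "2012-08-31", 10, 0), (1, "2012-08-31", 11, 1), (1, "2012-08-31", 12, 2),
    (2, "1999-04-16", 10, 0), (2, "1999-04-16", 21, 1), (2, "1999-04-16", 22, 2),
    (2, "1999-04-16", 23, 3), (2, "1999-04-16", 24, 4),
    (3, "2001-01-09", 30, 0), (3, "2001-01-09", 31, 1), (3, "2001-01-09", 11, 2),
    (3, "2001-01-09", 12, 3),
])

# Step 1: add the new column.
conn.execute("ALTER TABLE your_table ADD COLUMN company_id INTEGER")

# Step 2: copy the person_id of each group's sequence = 0 row into company_id
# (SQLite-friendly rewrite of the UPDATE ... JOIN).
conn.execute(
    """
    UPDATE your_table
    SET company_id = (SELECT b.person_id
                      FROM your_table AS b
                      WHERE b.group_id = your_table.group_id
                        AND b.sequence = 0)
    """
)

# Step 3: drop the rows that only carried the company id.
conn.execute("DELETE FROM your_table WHERE sequence = 0")

final = conn.execute(
    "SELECT company_id, group_id, person_id, sequence "
    "FROM your_table ORDER BY group_id, sequence"
).fetchall()
print(len(final))  # 9
```

This relies on every group having exactly one sequence = 0 row. A group without one would end up with a NULL company_id, so it is worth checking for that before the delete.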