What is happening when using DISTINCT? - mysql

Here is my table and the data contained in it:
Table: first
+----------+------+
| first_id | data |
+----------+------+
| 1 | 5 |
| 2 | 6 |
| 3 | 7 |
| 4 | 6 |
| 5 | 7 |
| 6 | 5 |
| 7 | 7 |
| 8 | 6 |
| 9 | 5 |
| 10 | 7 |
+----------+------+
Table: second
+-----------+----------+----------+
| second_id | first_id | third_id |
+-----------+----------+----------+
| 1 | 1 | 2 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 4 | 2 |
| 5 | 5 | 3 |
| 6 | 6 | 4 |
| 7 | 7 | 2 |
| 8 | 8 | 2 |
| 9 | 9 | 4 |
| 10 | 10 | 4 |
+-----------+----------+----------+
My intention is to get the list of third_ids ordered by data field. Now, I ran the following query for that.
SELECT
third_id, data
FROM
first f JOIN second s ON ( s.first_id = f.first_id )
ORDER BY
data ASC;
And I get the following result as expected.
+----------+------+
| third_id | data |
+----------+------+
| 4 | 5 |
| 2 | 5 |
| 4 | 5 |
| 2 | 6 |
| 3 | 6 |
| 2 | 6 |
| 2 | 7 |
| 4 | 7 |
| 4 | 7 |
| 3 | 7 |
+----------+------+
The following query also works as expected.
SELECT
third_id
FROM
first f JOIN second s ON ( s.first_id = f.first_id )
ORDER BY
data ASC;
with output
+----------+
| third_id |
+----------+
| 4 |
| 2 |
| 4 |
| 2 |
| 3 |
| 2 |
| 2 |
| 4 |
| 4 |
| 3 |
+----------+
Then I ran the following.
SELECT DISTINCT
third_id
FROM
first f JOIN second s ON ( s.first_id = f.first_id )
ORDER BY
data ASC;
But, here I get an unexpected result:
+----------+
| third_id |
+----------+
| 2 |
| 3 |
| 4 |
+----------+
Here, 3 must come after 2 and 4, since I am ordering on the data field. What am I doing wrong? Or do I have to go for a different strategy?
Note:
This scenario occurs in my project. The tables provided here do not belong to the original database; I created them to explain the problem. The original tables contain thousands of rows.
I am inserting database dump if you would like to experiment with the data:
--
-- Table structure for table `first`
--
CREATE TABLE IF NOT EXISTS `first` (
`first_id` int(11) NOT NULL AUTO_INCREMENT,
`data` int(11) NOT NULL,
PRIMARY KEY (`first_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `first`
--
INSERT INTO `first` (`first_id`, `data`) VALUES
(1, 5),
(2, 6),
(3, 7),
(4, 6),
(5, 7),
(6, 5),
(7, 7),
(8, 6),
(9, 5),
(10, 7);
--
-- Table structure for table `second`
--
CREATE TABLE IF NOT EXISTS `second` (
`second_id` int(11) NOT NULL AUTO_INCREMENT,
`first_id` int(11) NOT NULL,
`third_id` int(11) NOT NULL,
PRIMARY KEY (`second_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `second`
--
INSERT INTO `second` (`second_id`, `first_id`, `third_id`) VALUES
(1, 1, 2),
(2, 2, 3),
(3, 3, 4),
(4, 4, 2),
(5, 5, 3),
(6, 6, 4),
(7, 7, 2),
(8, 8, 2),
(9, 9, 4),
(10, 10, 4);

You probably want to do something like
SELECT third_id
FROM first JOIN second USING (first_id)
GROUP BY third_id
ORDER BY aggregatesomething(data)
where aggregatesomething is MIN(data), MAX(data), or whichever aggregate matches the ordering you want.
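A minimal sketch of the GROUP BY approach on the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so the example is self-contained. MIN(data) and the third_id tie-breaker are illustrative choices; the question does not say which aggregate is wanted.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "first"  (first_id INTEGER PRIMARY KEY, data INTEGER NOT NULL);
CREATE TABLE "second" (second_id INTEGER PRIMARY KEY,
                       first_id  INTEGER NOT NULL,
                       third_id  INTEGER NOT NULL);
INSERT INTO "first" VALUES
  (1,5),(2,6),(3,7),(4,6),(5,7),(6,5),(7,7),(8,6),(9,5),(10,7);
INSERT INTO "second" VALUES
  (1,1,2),(2,2,3),(3,3,4),(4,4,2),(5,5,3),
  (6,6,4),(7,7,2),(8,8,2),(9,9,4),(10,10,4);
""")

# One row per third_id, sorted by the smallest data value seen for it.
# third_id is added as a tie-breaker so the order is deterministic.
query = """
SELECT s.third_id, MIN(f.data) AS min_data
FROM "first" f JOIN "second" s ON s.first_id = f.first_id
GROUP BY s.third_id
ORDER BY min_data ASC, s.third_id ASC
"""
rows = conn.execute(query).fetchall()
ordered_third_ids = [third_id for third_id, _ in rows]
print(ordered_third_ids)
```

With MIN(data), third_id 3 sorts after 2 and 4, which is the order the question asks for. Using MAX(data) instead would tie all three values at 7.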

Doing a SELECT DISTINCT collapses the rows for each third_id into one, and databases commonly do this by sorting or hashing the selected column(s). As far as I'm aware, an ORDER BY on a column that is not in the SELECT DISTINCT output is not honoured (SQL Server won't accept the query at all), because it is not clear what it would mean: each distinct third_id is associated with several data values, so there is no single value to sort it by.

You may use a subquery -
SELECT DISTINCT third_id FROM (
SELECT
third_id
FROM
first f JOIN second s ON ( s.first_id = f.first_id )
ORDER BY
data ASC
) t;
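The idea is to select and sort all the rows first, then pick the distinct values. Be aware that MySQL does not guarantee that the outer query keeps the subquery's order, so the result may still come back unsorted.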

I had this exact problem before. I finally came up with a simple solution; it almost seems too simple. You need to use a subquery as a column of the select query, and that subquery is where you do the ordering by date. When you do it all in a single query, the ORDER BY is not applied before the DISTINCT. You want to order first, so go with the subquery. http://nathansnoggin.blogspot.com/2009/04/select-distinct-with-order-by.html

Related

How to create MySQL histogram type buckets based on column value

I have the following table and I am trying to create histogram style buckets through MySQL.
| Id | Values |
| -------- | --------- |
| 1 | 5 |
| 2 | 7 |
| 3 | 9 |
| 4 | 11 |
| 5 | 15 |
| 6 | 31 |
| 7 | 32 |
| 8 | 43 |
What I am trying to achieve is as following:
| bucket | count |
| -------- | --------- |
| 0-9 | 3 |
| 10-19 | 2 |
| 20-29 | 0 |
| 30-39 | 2 |
| 40-49 | 1 |
Does anyone know how we can get this in a clean way?
One possible way is to create a reference table for the bucket list then LEFT JOIN it with your table. Try the following steps.
Create a table bucket_list for example:
CREATE TABLE bucket_list (
id INT NOT NULL AUTO_INCREMENT,
startno INT,
endno INT,
PRIMARY KEY(id));
Insert values into bucket_list:
INSERT INTO bucket_list (startno, endno)
VALUES
(0, 9),
(10, 19),
(20, 29),
(30, 39),
(40, 49),
(50, 59),
(60, 69),
(70, 79);
Create a query to return expected result:
SELECT CONCAT(a.startno,'-',a.endno) AS bucket,
SUM(CASE WHEN b.val IS NULL THEN 0 ELSE 1 END) AS COUNT
FROM bucket_list a
LEFT JOIN mytable b ON b.val BETWEEN a.startno AND a.endno
GROUP BY bucket;
Here's a fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=7fee426efa2b1f1e39377bb7beb68b62
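The reference-table approach above can be sketched as follows. This is an illustration only: it runs on Python's built-in sqlite3 module rather than MySQL, so `||` replaces CONCAT, and the column name `val` follows the answer's query rather than the `Values` header in the question. COUNT(b.val) is used here as a shorter equivalent of the SUM(CASE ...) expression, since COUNT ignores the NULLs produced by the LEFT JOIN.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY, val INTEGER);
INSERT INTO mytable VALUES (1,5),(2,7),(3,9),(4,11),(5,15),(6,31),(7,32),(8,43);

CREATE TABLE bucket_list (id INTEGER PRIMARY KEY, startno INTEGER, endno INTEGER);
INSERT INTO bucket_list (startno, endno) VALUES
  (0,9),(10,19),(20,29),(30,39),(40,49);
""")

# LEFT JOIN keeps buckets that match no rows; COUNT(b.val) then yields 0 for them.
query = """
SELECT a.startno || '-' || a.endno AS bucket, COUNT(b.val) AS cnt
FROM bucket_list a
LEFT JOIN mytable b ON b.val BETWEEN a.startno AND a.endno
GROUP BY a.id, a.startno, a.endno
ORDER BY a.startno
"""
histogram = conn.execute(query).fetchall()
for bucket, cnt in histogram:
    print(bucket, cnt)
```

The explicit ORDER BY a.startno keeps the buckets in numeric order; sorting by the bucket label would compare strings instead.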

MySQL get lowest 3 values by user

I have a MySQL database and I want to SUM the lowest 3 Points per Person.
+--------+--------+
| Person | Points |
+--------+--------+
| 1 | 15 |
| 1 | 10 |
| 1 | 5 |
| 1 | 10 |
| 2 | 5 |
| 2 | 4 |
| 2 | 3 |
| 2 | 2 |
| 3 | 1 |
| 3 | 1 |
| 3 | 1 |
+--------+--------+
The result I want:
+-------+-----+
| 1 | 25 |
| 2 | 9 |
| 3 | 3 |
+-------+-----+
But I am really lost on how to solve this. This is my query so far:
SELECT person, SUM(points) FROM (SELECT SUM(points) FROM table
GROUP BY person ORDER BY points ASC LIMIT 3)
This is my SQL Create Script:
CREATE TABLE `mytable` (
`person` int(11) DEFAULT NULL,
`points` int(11) DEFAULT NULL
) ;
INSERT INTO `mytable` (`person`, `points`) VALUES
(1, 15),
(1, 10),
(1, 5),
(1, 10),
(2, 5),
(2, 4),
(2, 3),
(2, 2),
(3, 1),
(3, 1),
(3, 1);
Use a row number in the inner query as a helper to limit the rows. A row number is used rather than LIMIT to avoid the duplicate values problem. Not tested, but it should work; too lazy to create a fiddle.
SELECT Person, SUM(Points) FROM
(SELECT Person, Points,
-- restart the counter whenever the person changes
-- (alias rn instead of rank: RANK is a reserved word in MySQL 8.0)
CASE Person
WHEN @person THEN @rn := @rn + 1
ELSE @rn := 1
END AS rn,
@person := Person
FROM mytable, (SELECT @rn := 0, @person := '') x
-- rows must arrive grouped by person, lowest points first
ORDER BY Person, Points ASC) y
WHERE y.rn <= 3
GROUP BY Person
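As an alternative to user variables, the same "row number per person" idea can be written with a window function. The sketch below is an assumption-laden illustration, not part of the original answer: it runs on Python's built-in sqlite3 module (SQLite 3.25 or later is needed for ROW_NUMBER), and the same SQL should also be accepted by MySQL 8.0 or later.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (person INTEGER, points INTEGER);
INSERT INTO mytable (person, points) VALUES
  (1,15),(1,10),(1,5),(1,10),
  (2,5),(2,4),(2,3),(2,2),
  (3,1),(3,1),(3,1);
""")

# Number each person's rows from lowest to highest points, keep the first
# three, then sum them. ROW_NUMBER (unlike RANK) gives tied rows distinct
# numbers, so duplicates such as the two 10s for person 1 are both counted.
query = """
SELECT person, SUM(points) AS lowest_three
FROM (
    SELECT person, points,
           ROW_NUMBER() OVER (PARTITION BY person ORDER BY points ASC) AS rn
    FROM mytable
) ranked
WHERE rn <= 3
GROUP BY person
ORDER BY person
"""
totals = conn.execute(query).fetchall()
print(totals)
```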

MySQL subquery/complex query question, tracking state changes

I have a table that tracks contact class state changes by date. The question that I am trying to answer is what is the current state of all contacts on a certain date.
DROP TABLE IF EXISTS `contact_class_state`;
CREATE TABLE `contact_class_state` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`contact_id` int unsigned DEFAULT NULL, -- the contact
`contact_class` int unsigned,
`state_date` date,
PRIMARY KEY (`id`),
INDEX (`contact_id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `contact_class_state` (`contact_id`, `contact_class`, `state_date`) VALUES
(1, 1, '2011-01-01'),
(2, 1, '2011-01-01'),
(3, 1, '2011-01-01'),
(4, 1, '2011-01-01'),
(5, 1, '2011-01-01'),
(1, 2, '2011-02-01'),
(3, 2, '2011-02-01'),
(5, 2, '2011-02-01'),
(1, 1, '2011-02-15'),
(5, 3, '2011-03-01');
For example, the following query:
SELECT contact_id, contact_class, state_date
FROM contact_class_state
WHERE state_date <= '2011-02-27'
ORDER BY contact_id, state_date DESC
returns
+------------+---------------+------------+
| contact_id | contact_class | state_date |
+------------+---------------+------------+
| 1 | 1 | 2011-02-15 |
| 1 | 2 | 2011-02-01 |
| 1 | 1 | 2011-01-01 |
| 2 | 1 | 2011-01-01 |
| 3 | 2 | 2011-02-01 |
| 3 | 1 | 2011-01-01 |
| 4 | 1 | 2011-01-01 |
| 5 | 2 | 2011-02-01 |
| 5 | 1 | 2011-01-01 |
+------------+---------------+------------+
While this is technically correct, I only need the first (or last if sorted ASC) row for each contact_id as the latest date will always give me current state of the contact, per the below:
+------------+---------------+------------+
| contact_id | contact_class | state_date |
+------------+---------------+------------+
| 1 | 1 | 2011-02-15 |
| 2 | 1 | 2011-01-01 |
| 3 | 2 | 2011-02-01 |
| 4 | 1 | 2011-01-01 |
| 5 | 2 | 2011-02-01 |
+------------+---------------+------------+
I am pretty sure a sub or a complex query would do the trick but I am having a mental block with the SQL. I am also open to other approaches to solve this issue.
Thanks!
If your query is in fact what you want (except then grouped by contact_id), then do exactly that.
SELECT * FROM
(SELECT contact_id, contact_class, state_date
FROM contact_class_state
WHERE state_date <= '2011-02-27'
ORDER BY contact_id, state_date DESC) table1
GROUP BY contact_id
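This is tested and works perfectly. Note that it relies on MySQL returning the first row of each group, which is not guaranteed, and the query is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5).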
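A sketch of a variant that does not depend on which row GROUP BY happens to keep: find each contact's latest state_date up to the cutoff, then join back to fetch the matching row. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with dates stored as ISO text; the alias names and the `:cutoff` parameter are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact_class_state (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    contact_id    INTEGER,
    contact_class INTEGER,
    state_date    TEXT      -- ISO dates compare correctly as text
);
INSERT INTO contact_class_state (contact_id, contact_class, state_date) VALUES
  (1,1,'2011-01-01'),(2,1,'2011-01-01'),(3,1,'2011-01-01'),
  (4,1,'2011-01-01'),(5,1,'2011-01-01'),(1,2,'2011-02-01'),
  (3,2,'2011-02-01'),(5,2,'2011-02-01'),(1,1,'2011-02-15'),
  (5,3,'2011-03-01');
""")

# Inner query: latest date per contact on or before the cutoff.
# Outer join: pick the full row that carries that date.
query = """
SELECT s.contact_id, s.contact_class, s.state_date
FROM contact_class_state s
JOIN (
    SELECT contact_id, MAX(state_date) AS latest
    FROM contact_class_state
    WHERE state_date <= :cutoff
    GROUP BY contact_id
) m ON m.contact_id = s.contact_id AND m.latest = s.state_date
ORDER BY s.contact_id
"""
current = conn.execute(query, {"cutoff": "2011-02-27"}).fetchall()
for row in current:
    print(row)
```

If a contact could have two state changes on the same date, this would return both rows, so a tie-breaker (for example the highest id) would be needed in that case.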

Need help with a complex MySQL query

I'm struggling to get a final result set for a 3 table hierarchical set of data. Hopefully, the diagrams will indicate what I have and what I'm trying to do. Briefly, my final result set (below) should easily allow me to define a dynamic number of checkboxes in my web site, while also allowing me to define whether the boxes are checked, all from within a single result set. I believe that since the data is normalized, I should be able to get a single result set, but I can't get my head wrapped around this one... Can anyone help?
TABLE A              TABLE B               TABLE C
MEMBER               CONTACT               ALERT
(pk)$member_id  ->   (pk)$contact_id  ->   (pk)$alert_id
                     (fk)$member_id        (fk)$contact_id
                                           $alert_type ->
                                           -> 'local', 'state', 'nation'
Example of my filter criteria is member_id = 1 AND alert_type = 'local'
* = results of filter member_id = 1
TABLE MEMBERS A
+----------+----------+
|member_id | Name |
+----------+----------+
| 1 | Alan | *
| 2 | Brad |
| 3 | Doug |
| 4 | Flo |
+---------------------+
TABLE CONTACTS B
+--------------------------------------------------------------------+
| contact_id | member_id | email | phone | Name |
+------------+-------------+---------------+--------------+----------+
| 1 | 1 | a@gmail.com | | Alex | *
| 2 | 1 | b@gmail.com | 123-456-7890 | Bob | *
| 3 | 3 | c@gmail.com | | Cris |
| 4 | 1 | d@gmail.com | | Dan | *
| 5 | 2 | e@gmail.com | | Ed |
| 6 | 1 | f@gmail.com | | Fran | *
| 7 | 1 | g@gmail.com | 212-323-1111 | Greg | *
| 8 | 2 | h@gmail.com | | Hans |
| 9 | 3 | i@gmail.com | | Ida |
| 10 | 1 | j@gmail.com | 945-555-1212 | Jeff | *
| 11 | 2 | k@gmail.com | 945-555-1212 | Karl |
| 12 | 3 | l@gmail.com | | Leo |
+--------------------------------------------------------------------+
# = results of filter alert_type = 'local'
TABLE CONTACTS_SELECTED C
+-----------------------------------------+
| alert_id | contact_id | alert_type |
+------------+------------+---------------+
| 1 | 1 | local | * #
| 2 | 1 | state | *
| 3 | 3 | state |
| 4 | 5 | local |
| 5 | 5 | state |
| 6 | 6 | nation | *
| 7 | 7 | local | * #
| 8 | 8 | nation |
| 9 | 10 | local | *
| 10 | 12 | state |
+-------------------------+---------------+
REQUIRED OUTPUT
+------------------------------------------------------------------------------------+
|member_id | contact_id | email | phone | Name | alert_type |
+----------+--------------+---------------+--------------+----------+----------------+
| 1 | 1 | a@gmail.com | | Alex | local |
| 1 | 2 | b@gmail.com | 123-456-7890 | Bob | NULL |
| 1 | 4 | d@gmail.com | | Dan | NULL |
| 1 | 6 | f@gmail.com | | Fran | nation |
| 1 | 7 | g@gmail.com | 212-323-1111 | Greg | local |
| 1 | 10 | j@gmail.com | 945-555-1212 | Jeff | local |
+------------------------------------------------------------------------------------+
With this result set, I should be easily able to FOREACH my way through all 6 records and create a checkbox for each record, and flag those records with 'local' as checked. Can anyone help with setting up this complex query?
--
-- Table structure for table `contacts`
--
CREATE TABLE IF NOT EXISTS `contacts` (
`contact_id` int(12) NOT NULL AUTO_INCREMENT,
`member_id` int(12) NOT NULL,
`email` varchar(30) NOT NULL,
`phone` varchar(15) NOT NULL,
`name` varchar(30) NOT NULL,
PRIMARY KEY (`contact_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=13 ;
--
-- Dumping data for table `contacts`
--
INSERT INTO `contacts` (`contact_id`, `member_id`, `email`, `phone`, `name`) VALUES
(1, 1, 'a@gmail.com', '', 'Alex'),
(2, 1, 'b@gmail.com', '123-456-7890', 'Bob'),
(3, 3, 'c@gmail.com', '', 'Cris'),
(4, 1, 'd@gmail.com', '987-654-3210', 'Dan'),
(5, 2, 'e@gmail.com', '', 'Ed'),
(6, 1, 'f@gmail.com', '', 'Fran'),
(7, 2, 'h@gmail.com', '234-567-8901', 'Hans'),
(8, 3, 'i@gmail.com', '', 'Ida'),
(9, 1, 'g@gmail.com', '', 'Greg'),
(10, 1, 'j@gmail.com', '456-789-0123', 'Jeff'),
(11, 2, 'k@gmail.com', '945-555-1212 ', 'Karl'),
(12, 3, 'l@gmail.com', '', 'Leo');
CREATE TABLE IF NOT EXISTS `contacts_selected` (
`alert_id` int(12) NOT NULL AUTO_INCREMENT,
`contact_id` int(12) NOT NULL,
`alert_type` varchar(6) NOT NULL,
PRIMARY KEY (`alert_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=12 ;
--
-- Dumping data for table `contacts_selected`
--
INSERT INTO `contacts_selected` (`alert_id`, `contact_id`, `alert_type`) VALUES
(1, 1, 'local'),
(2, 1, 'state'),
(3, 3, 'state'),
(4, 5, 'local'),
(5, 5, 'state'),
(6, 6, 'nation'),
(7, 7, 'local'),
(8, 8, 'nation'),
(9, 10, 'local'),
(10, 12, 'state'),
(11, 1, 'nation');
CREATE TABLE IF NOT EXISTS `alert_types` (
`alert_type` varchar(6) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `alert_types`
--
INSERT INTO `alert_types` (`alert_type`) VALUES
('local'),
('state'),
('nation');
SOLUTION:
$alert_type = 'local';
// choices are local, state, nation
//
SELECT c.contact_id, c.member_id, c.email, c.phone, c.desc, s.alert_type
FROM contact c
LEFT JOIN contact_select s
ON c.contact_id = s.contact_id
WHERE c.member_id = 1 AND c.contact_id NOT IN
(SELECT cs.contact_id FROM contact_select cs WHERE cs.alert_type = '$alert_type')
GROUP BY c.contact_id
UNION
SELECT * FROM
(SELECT c.contact_id, c.member_id, c.email, c.phone, c.desc, s.alert_type
FROM contact c
LEFT JOIN contact_select s
ON c.contact_id = s.contact_id
WHERE c.member_id = 1
AND s.contact_id
IN (SELECT cs.contact_id FROM contact_select cs WHERE cs.alert_type = '$alert_type')) z
WHERE z.alert_type = '$alert_type'
This should give you your desired output.
SELECT C.member_id, C.contact_id, C.email, C.phone, C.name, S.alert_type
FROM CONTACTS C
LEFT OUTER JOIN CONTACTS_SELECTED S
ON C.contact_id = S.contact_id
WHERE member_id = 1
SELECT c.member_id, c.contact_id, c.email, c.phone, c.name, cs.alert_type
FROM contacts c
LEFT JOIN contacts_selected cs ON cs.contact_id = c.contact_id
WHERE c.member_id = 1
I'm not sure I understand exactly what you mean, but maybe you're looking for this?
I would try this:
select `c`.`contact_id`, `member_id`, `email`, `phone`, `name`, `alert_type` from contacts `c`
left join contacts_selected `s` on `c`.`contact_id` = `s`.`contact_id`
where member_id=1
group by `c`.`contact_id`
However, two points: One, it's not clear to me how you want to narrow the result set to select only one of the alert types. Two, your sample data and your insert statements contain slightly different data. That's not a problem, but it is a little confusing at first.
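On the first point, one common way to narrow the alert type without losing contacts is to put the alert_type condition in the LEFT JOIN's ON clause instead of the WHERE clause. The sketch below assumes the only goal is to know, for each of the member's contacts, whether the chosen alert is set (so the checkbox can be ticked). Contacts that have other alert types only therefore show NULL, which differs from the required output in the question, where Fran shows 'nation'. It runs on Python's built-in sqlite3 module as a stand-in for MySQL, uses the data from the INSERT statements, and omits the email and phone columns for brevity.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, member_id INTEGER, name TEXT);
INSERT INTO contacts VALUES
  (1,1,'Alex'),(2,1,'Bob'),(3,3,'Cris'),(4,1,'Dan'),(5,2,'Ed'),(6,1,'Fran'),
  (7,2,'Hans'),(8,3,'Ida'),(9,1,'Greg'),(10,1,'Jeff'),(11,2,'Karl'),(12,3,'Leo');

CREATE TABLE contacts_selected (alert_id INTEGER PRIMARY KEY,
                                contact_id INTEGER, alert_type TEXT);
INSERT INTO contacts_selected VALUES
  (1,1,'local'),(2,1,'state'),(3,3,'state'),(4,5,'local'),(5,5,'state'),
  (6,6,'nation'),(7,7,'local'),(8,8,'nation'),(9,10,'local'),(10,12,'state'),
  (11,1,'nation');
""")

# The alert_type filter lives in ON, so contacts without a matching alert
# are still returned, with alert_type NULL (None in Python).
query = """
SELECT c.member_id, c.contact_id, c.name, s.alert_type
FROM contacts c
LEFT JOIN contacts_selected s
       ON s.contact_id = c.contact_id AND s.alert_type = :alert_type
WHERE c.member_id = :member_id
ORDER BY c.contact_id
"""
rows = conn.execute(query, {"alert_type": "local", "member_id": 1}).fetchall()
for row in rows:
    print(row)
```

Moving the same condition into WHERE would turn the outer join into an inner join in effect, dropping every contact that has no 'local' alert.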

Grouping MySQL data

I have this table, let's call it table one.
+----+---------+-----------------+
| id | link_id | url |
+----+---------+-----------------+
| 1 | 1 | www.example.com |
| 2 | 1 | www.abc.com |
| 3 | 1 | www.test.com |
| 4 | 1 | www.t1.com |
| 5 | 1 | www.newtest.com |
| 6 | 1 | www.testing.com |
| 7 | 1 | www.abc.com |
| 8 | 1 | www.example.com |
| 9 | 1 | www.web1.com |
| 10 | 1 | www.web2.com |
| 11 | 2 | www.dear.com |
| 12 | 2 | www.google.com |
| 13 | 2 | www.flowers.com |
| 14 | 2 | www.yahoo.com |
| 15 | 2 | www.abc.com |
| 16 | 2 | www.dell.com |
| 17 | 2 | www.web.com |
| 18 | 2 | www.example.com |
| 19 | 2 | www.test.com |
| 20 | 2 | www.abc.com |
+----+---------+-----------------+
20 rows in set (0.00 sec)
The link_id is sort of the primary identifier in the table. It tells me which URLs appear in link 1, link 2, etc.
What I want to accomplish is:
1. Get all the unique URLs,
2. Show which links the URL belongs to
So an example output would be:
+-----------------+---------+
| url | link_id |
+-----------------+---------+
| www.example.com | 1 |
| www.example.com | 2 |
| www.abc.com | 1 |
| www.abc.com | 2 |
| www.test.com | 1 |
| www.test.com | 2 |
| www.t1.com | 1 |
| www.newtest.com | 1 |
| www.testing.com | 1 |
| www.web1.com | 1 |
...and so on.
So you can see that www.example.com appears twice since it is associated with both links 1 and 2, but web1.com appears only once since it belongs only to link 1.
I have tried several different group by but I only end up scratching my head even more.
Any help is appreciated. Here is the table dump if anyone needs:
CREATE TABLE IF NOT EXISTS `table1` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`link_id` tinyint(3) unsigned DEFAULT NULL,
`url` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=21 ;
INSERT INTO `table1` (`id`, `link_id`, `url`) VALUES
(1, 1, 'www.example.com'),
(2, 1, 'www.abc.com'),
(3, 1, 'www.test.com'),
(4, 1, 'www.t1.com'),
(5, 1, 'www.newtest.com'),
(6, 1, 'www.testing.com'),
(7, 1, 'www.abc.com'),
(8, 1, 'www.example.com'),
(9, 1, 'www.web1.com'),
(10, 1, 'www.web2.com'),
(11, 2, 'www.dear.com'),
(12, 2, 'www.google.com'),
(13, 2, 'www.flowers.com'),
(14, 2, 'www.yahoo.com'),
(15, 2, 'www.abc.com'),
(16, 2, 'www.dell.com'),
(17, 2, 'www.web.com'),
(18, 2, 'www.example.com'),
(19, 2, 'www.test.com'),
(20, 2, 'www.abc.com');
Wouldn't a DISTINCT list work? Does order matter?
SELECT DISTINCT url, link_id
FROM `table1`
ORDER BY 1, 2
Unless I'm misunderstanding the question, it sounds like all you need is a DISTINCT clause:
select distinct url, link_id from table1;
SELECT url, GROUP_CONCAT(link_id)
FROM table1
GROUP BY url;
That'll give you all the distinct URLs, each with a list of link ids
Select url, link_id
From Table1
Group By url, link_id
select url, link_id from table1 group by link_id, url
Well, IMHO you should group by both link_id and url, and then maybe sort by url so the same URLs are together.
SELECT url, link_id FROM table1
GROUP BY url, link_id
ORDER BY url
Unless I'm missing something:
SELECT DISTINCT url, link_id FROM table1;