How come RAND() is messing up in SQL subquery? - mysql

My goal is to select a random business and then with that business' id get all of their advertisements. I am getting unexpected results from my query. The number of advertisement rows returned is always what I assume is the value of "SELECT id FROM Business ORDER BY RAND() LIMIT 1". I have 3 businesses and only 1 business that has advertisement rows (5 of them) yet it always displays between 1-3 of the 5 advertisements for the same business.
SELECT * FROM Advertisement WHERE business_id=(SELECT id FROM Business ORDER BY RAND() LIMIT 1) ORDER BY priority
Business TABLE:
Advertisement TABLE:
Data for Advertisement and Business tables:
INSERT INTO `Advertisement` (`id`, `business_id`, `image_url`, `link_url`, `priority`) VALUES
(1, 1, 'http://i64.tinypic.com/2w4ehqw.png', 'https://www.dennys.com/food/burgers-sandwiches/spicy-sriracha-burger/', 1),
(2, 1, 'http://i65.tinypic.com/zuk1w1.png', 'https://www.dennys.com/food/burgers-sandwiches/prime-rib-philly-melt/', 2),
(3, 1, 'http://i64.tinypic.com/8yul3t.png', 'https://www.dennys.com/food/burgers-sandwiches/cali-club-sandwich/', 3),
(4, 1, 'http://i64.tinypic.com/o8fj9e.png', 'https://www.dennys.com/food/burgers-sandwiches/bacon-slamburger/', 4),
(5, 1, 'http://i68.tinypic.com/mwyuiv.png', 'https://www.dennys.com/food/burgers-sandwiches/the-superbird/', 5);
INSERT INTO `Business` (`id`, `name`) VALUES
(1, 'Test Dennys'),
(2, 'Test Business 2'),
(3, 'Test Business 3');

You're assuming your query does something it doesn't do.
(SELECT id FROM Business ORDER BY RAND() LIMIT 1) isn't materialized at the beginning of the query. It's evaluated for each row... so for each row, we're testing whether that business_id matches the result of a newly-executed instance of the subquery. More thorough test data (more than one business included) should reveal this.
You need to materialize the result into a derived table, then join to it.
SELECT a.*
FROM Advertisement a
JOIN (
SELECT (SELECT id
FROM Business
ORDER BY RAND()
LIMIT 1) AS business_id
) b ON b.business_id = a.business_id;
The ( SELECT ... ) x construct creates a temporary table that exists only for the duration of the query and uses the alias x. Such tables can be joined just like real tables.
MySQL calls this a Subquery in the FROM Clause.
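One way to be sure the random pick happens exactly once is to run it as its own query and feed the result into a second query. The sketch below does that in Python against an in-memory SQLite database, purely as an illustration: SQLite's RANDOM() stands in for MySQL's RAND(), the tables are trimmed to the columns that matter, and random_business_ads is a hypothetical helper name.

```python
import sqlite3

def random_business_ads(conn):
    """Pick one business at random, then return (business_id, its ads)."""
    # Step 1: evaluate the random pick exactly once.
    row = conn.execute(
        "SELECT id FROM Business ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    if row is None:
        return None, []
    business_id = row[0]
    # Step 2: fetch every advertisement of that single business.
    ads = conn.execute(
        "SELECT id, business_id, priority FROM Advertisement "
        "WHERE business_id = ? ORDER BY priority",
        (business_id,),
    ).fetchall()
    return business_id, ads

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Advertisement (id INTEGER PRIMARY KEY, business_id INTEGER, priority INTEGER);
INSERT INTO Business VALUES (1, 'Test Dennys'), (2, 'Test Business 2'), (3, 'Test Business 3');
INSERT INTO Advertisement VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4), (5, 1, 5);
""")

business_id, ads = random_business_ads(conn)
print(business_id, len(ads))
```

With the sample data this returns either all five advertisements (when business 1 is picked) or none, never a partial set.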

Try the following query. Note that it assumes the ids in Business are contiguous and start at 1:
SELECT * FROM Advertisement WHERE business_id = (select floor(1 + rand()* (select count(*) from Business)));

Wrap the random selection in a derived table (using SELECT * instead of just the id), then read the id from that derived table:
SELECT * FROM Advertisement WHERE business_id=(SELECT ID FROM (SELECT * FROM Business ORDER BY RAND() LIMIT 1) as table1)
With your example data, you only get results when the random pick is business 1.

Related

How to insert into a row, a value from other table?

I have 3 Mysql tables.
A table with the classes and the labs and their id.
A table with the teachers_list and their subject.
A table which is going to be the schedule.
I want to randomly assign one of the physicists to one of the physics labs on my third table which is going to be the schedule.
INSERT INTO schedule(teacher_name, class_id)
VALUES (select teacher_name from teachers_list where subject="Physicist” order by rand() limit 1,
select id from lab_list where lab="Physics_lab" order by rand() limit 1);
This one doesn't work :(
Can you help me?
I think that you want the insert ... select syntax, along with a subquery:
insert into schedule(teacher_name, class_id)
select
(
select teacher_name
from teachers_list
where subject = 'Physicist'
order by rand()
limit 1
),
id
from lab_list
where lab = 'Physics_lab'
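To see the shape of the insert ... select statement end to end, here is a small sketch in Python with an in-memory SQLite database. It is only an approximation of the MySQL setup: RANDOM() replaces RAND(), and the teacher and lab rows are made-up sample data, since the question does not show any.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teachers_list (teacher_name TEXT, subject TEXT);
CREATE TABLE lab_list (id INTEGER PRIMARY KEY, lab TEXT);
CREATE TABLE schedule (teacher_name TEXT, class_id INTEGER);
-- Hypothetical sample rows, not taken from the question.
INSERT INTO teachers_list VALUES ('Alice', 'Physicist'), ('Bob', 'Physicist'), ('Carol', 'Chemist');
INSERT INTO lab_list VALUES (1, 'Physics_lab'), (2, 'Physics_lab'), (3, 'Chemistry_lab');
""")

# One schedule row per physics lab, each paired with a randomly chosen physicist.
conn.execute("""
INSERT INTO schedule (teacher_name, class_id)
SELECT (SELECT teacher_name
        FROM teachers_list
        WHERE subject = 'Physicist'
        ORDER BY RANDOM()
        LIMIT 1),
       id
FROM lab_list
WHERE lab = 'Physics_lab'
""")

schedule = conn.execute(
    "SELECT teacher_name, class_id FROM schedule ORDER BY class_id"
).fetchall()
print(schedule)
```

The statement inserts one row for every physics lab. If only a single lab should be scheduled, the outer select would also need its own ORDER BY and LIMIT 1.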

SQL question. Find the two person having same hobbies in one table

TABLE [tbl_hobby]
person_id (int) , hobby_id(int)
The table has many records. I want a SQL query that finds all pairs of person_id values with exactly the same hobbies (the same set of hobby_id values).
If A has hobby_id 1, B has it too; if A doesn't have hobby_id 2, B doesn't have it either. In that case we output the person_ids of A and B.
If A, B and C all match, we output A & B, B & C, and A & C.
I got it working with a very clumsy method: multiple self-joins of the table and multiple subqueries. And of course my team lead laughed at it.
Is there a high-performance way to do this in SQL?
I have been thinking hard about this for the past 36 hours.
sample data in mysql dump
CREATE TABLE `tbl_hobby` (
`person_id` int(11) NOT NULL,
`hobby_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tbl_hobby` (`person_id`, `hobby_id`) VALUES
(1, 1),(1, 2),(1, 3),(1, 4),(1, 5),(2, 2),
(2, 3),(2, 4),(3, 1),(3, 2),(3, 3),(3, 4),
(4, 1),(4, 3),(4, 4),(5, 1),(5, 5),(5, 9),
(6, 2),(6, 3),(6, 4),(7, 1),(7, 3),(7, 7),
(8, 2),(8, 3),(8, 4),(9, 1),(9, 2),(9, 3),
(9, 4),(10, 1),(10, 5),(10, 9),(10, 11);
COMMIT;
Expected result: (2, 6 and 8 are the same; 3 and 9 are the same)
2,6
2,8
6,8
3,9
The order of the result records and the order of the two numbers within a record are not important. Results in one column or in two columns are both acceptable, since they can easily be concatenated or separated.
Aggregate per person to get a string of their hobbies. Then aggregate per hobby list to find out which lists belong to more than one person.
select hobbies, group_concat(person_id order by person_id) as persons
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) persons
group by hobbies
having count(*) > 1
order by hobbies;
This gives a list of persons per hobby list, which is the easiest way to output a solution, as we would otherwise have to build all possible pairs.
UPDATE: If you want pairs, you'll have to query the table twice:
select p1.person_id as person1, p2.person_id as person2
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p1
join
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p2 on p2.person_id > p1.person_id and p2.hobbies = p1.hobbies
order by person1, person2;
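The same two-step idea (build each person's hobby set, then group people by identical sets) can be checked outside the database. The following is a minimal Python sketch using the sample rows from the question; same_hobby_pairs is a hypothetical helper name, and this is meant for verifying the expected result rather than as a replacement for the SQL.

```python
from collections import defaultdict
from itertools import combinations

def same_hobby_pairs(rows):
    """Return sorted (person, person) pairs whose hobby sets are identical."""
    # Step 1: aggregate per person into a set of hobby ids.
    hobbies = defaultdict(set)
    for person_id, hobby_id in rows:
        hobbies[person_id].add(hobby_id)
    # Step 2: group persons by their (hashable) hobby set.
    groups = defaultdict(list)
    for person_id, hobby_set in hobbies.items():
        groups[frozenset(hobby_set)].append(person_id)
    # Step 3: emit every pair inside each group.
    pairs = []
    for persons in groups.values():
        pairs.extend(combinations(sorted(persons), 2))
    return sorted(pairs)

rows = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 2),
    (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4),
    (4, 1), (4, 3), (4, 4), (5, 1), (5, 5), (5, 9),
    (6, 2), (6, 3), (6, 4), (7, 1), (7, 3), (7, 7),
    (8, 2), (8, 3), (8, 4), (9, 1), (9, 2), (9, 3),
    (9, 4), (10, 1), (10, 5), (10, 9), (10, 11),
]

print(same_hobby_pairs(rows))  # [(2, 6), (2, 8), (3, 9), (6, 8)]
```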
Alternative version, without using any proprietary string handling:
select distinct t1.person_id, t2.person_id
from tbl_hobby t1
join tbl_hobby t2
on t1.person_id < t2.person_id
where 2 = all (select count(*)
from tbl_hobby
where person_id in (t1.person_id, t2.person_id)
group by hobby_id);
Perhaps less efficient, but portable!

One row per group with multiple column sorting

Would like to return one row per group, where the one is selected by multiple sort columns. Treading lightly here in the land of greatest-n-per-group to avoid a duplicate question.
SCHEMA:
CREATE TABLE logs (
id INT NOT NULL,
ip_address INT NOT NULL,
status INT NOT NULL,
PRIMARY KEY (id)
);
DATA:
INSERT INTO logs (id, ip_address, status)
VALUES ('1', 19216800, 1),
('2', 19216801, 2),
('3', 19216800, 2),
('4', 19216803, 0),
('5', 19216804, 0),
('6', 19216803, 0),
('7', 19216804, 1);
CURRENT QUERY:
SELECT *
FROM logs
ORDER BY ip_address, status=1 DESC, id DESC
Note: sorting by status=1 effectively turns the status column into a boolean. The tie breaker after status=1 is id. This query currently returns the correct row for each ip_address first and then a bunch of other rows I don't want for that ip_address.
CURRENT OUTPUT:
1, 19216800, 1
3, 19216800, 2
2, 19216801, 2
6, 19216803, 0
4, 19216803, 0
7, 19216804, 1
5, 19216804, 0
WANTED OUTPUT:
1, 19216800, 1
2, 19216801, 2
6, 19216803, 0
7, 19216804, 1
Today my workaround is to filter in PHP with if ($lastIP == $row['ip_address']) continue;. But I would like to move this logic to MySQL.
Try this -
SELECT MIN(id), ip_address, status
FROM logs
GROUP BY ip_address, status
Since there are already hundreds of solutions for greatest-n-per-group problems in MySQL, I'm going to start answering these questions with CTE syntax with window functions, since that is now available in MySQL 8.0.3.
WITH sorted AS (
SELECT id, ip_address, status,
ROW_NUMBER() OVER (PARTITION BY ip_address ORDER BY status = 1 DESC, id DESC) AS rn
FROM logs
)
SELECT * FROM sorted WHERE rn = 1;
Here is a different way to think about the problem. You want to find the "best" row for each ip_address. Or in other words, you want to select rows where no better row exists.
This solution works for MySQL versions before 8.0. In other words, it works with the version you already have installed with RHEL 7. You can extend this technique easily for an arbitrary number of sort columns.
SELECT a.*
FROM (SELECT * FROM logs) a
LEFT JOIN (SELECT * FROM logs) b
ON (b.ip_address = a.ip_address AND (b.status=1) > (a.status=1))
OR (b.ip_address = a.ip_address AND (b.status=1) = (a.status=1) AND b.id > a.id)
WHERE b.id IS NULL
ORDER BY a.ip_address
If you have more columns to sort by, keep adding OR clauses to handle the tie breaks and select the "best" row for each ip_address. Regardless of how complicated your subquery is or how many ORDER BY conditions you have, you only need one LEFT JOIN with this technique.
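As a quick check of the "no better row exists" technique, here is a sketch that runs the anti-join against the question's sample data in an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, and best_rows is a hypothetical helper name; the comparison (status = 1) evaluates to 0 or 1 in both engines.

```python
import sqlite3

BEST_ROW_SQL = """
SELECT a.id, a.ip_address, a.status
FROM logs a
LEFT JOIN logs b
  ON b.ip_address = a.ip_address
 AND ((b.status = 1) > (a.status = 1)                        -- b wins on status = 1
      OR ((b.status = 1) = (a.status = 1) AND b.id > a.id))  -- tie: higher id wins
WHERE b.id IS NULL                                           -- keep rows with no better row
ORDER BY a.ip_address
"""

def best_rows(conn):
    """Return one (id, ip_address, status) row per ip_address."""
    return conn.execute(BEST_ROW_SQL).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, "
    "ip_address INTEGER NOT NULL, status INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO logs (id, ip_address, status) VALUES (?, ?, ?)",
    [
        (1, 19216800, 1), (2, 19216801, 2), (3, 19216800, 2),
        (4, 19216803, 0), (5, 19216804, 0), (6, 19216803, 0),
        (7, 19216804, 1),
    ],
)

for row in best_rows(conn):
    print(row)
```

On the sample data this yields rows 1, 2, 6 and 7, which matches the wanted output in the question.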
Try this:
SELECT
l.`ip_address` , l.`status`
FROM
`logs` l
GROUP BY l.`ip_address`
ORDER BY l.`status` = 1 DESC

How can I find which two rows have timestamps closest to each other?

I'm building a web application for location-based check ins, sort of like a local 4square, but based on RFID tags.
Anyway, each check-in is stored in a MySQL table with a userID and the time of the check-in as a DATETIME column.
Now I'd like to show which users have the closest check-in times between different stations.
Explanation: Let's say user A checked in at 21:43:12 and then again at 21:43:19. He moved between stations in 7 seconds.
There are thousands of check-ins in the database, how do I write SQL to select the users with the two closest check-in times?
Try this:
select
a.id,
b.id,
abs(a.rfid-b.rfid)
from
table1 a,
table1 b
where
a.userID=b.userID
-- and any other conditions to make it a single user
group by
a.id,
b.id,
a.rfid,
b.rfid
order by
abs(a.rfid-b.rfid) desc
limit 1
A really fast solution would introduce some precalculation, such as storing the difference between the current and the previous check-in.
In that case you could select what you need quickly (as long as that column is covered by an index).
Without precalculation, you end up with terrible queries that operate over Cartesian-like products.
What have you tried? Have you looked at DATEDIFF
http://msdn.microsoft.com/en-us/library/ms189794.aspx
Cheers
--Jocke
First, you want an index on the user and then the timestamp.
Second, you need to use correlated sub-queries to find "the next timestamp".
Then you use GROUP BY to find the smallest interval per user.
SELECT
a.user_id,
MIN(TIMEDIFF(b.timestamp, a.timestamp)) AS min_duration
FROM
checkin AS a
INNER JOIN
checkin AS b
ON b.user_id = a.user_id
AND b.timestamp = (SELECT MIN(timestamp)
FROM checkin
WHERE user_id = a.user_id
AND timestamp > a.timestamp)
GROUP BY
a.user_id
ORDER BY
min_duration
LIMIT
1
If you want to allow for multiple users with the same min_duration, I recommend storing the results (without the LIMIT 1) in a temporary table, then searching that table for all users that share the minimum duration.
Depending on the volume of data, this could be slow. One optimisation would be to cache the results of the TIMEDIFF(). Every time a new checkin is recorded, also calculate and store the duration since the last checkin, maybe using triggers. Having this pre-calculated makes the query simpler and the values indexable.
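The "next timestamp" join can be tried on a tiny data set before running it against thousands of rows. Below is a sketch in Python with an in-memory SQLite database; it is an approximation of the MySQL query, with strftime('%s', ...) standing in for TIMEDIFF, a column named ts instead of timestamp, and invented check-in rows for two users.

```python
import sqlite3

CLOSEST_SQL = """
SELECT a.user_id,
       MIN(CAST(strftime('%s', b.ts) AS INTEGER)
           - CAST(strftime('%s', a.ts) AS INTEGER)) AS min_seconds
FROM checkin AS a
JOIN checkin AS b
  ON b.user_id = a.user_id
 AND b.ts = (SELECT MIN(ts)            -- the user's next check-in after a.ts
             FROM checkin
             WHERE user_id = a.user_id
               AND ts > a.ts)
GROUP BY a.user_id
ORDER BY min_seconds
LIMIT 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkin (user_id INTEGER, ts TEXT)")
# The index suggested in the answer: user first, then timestamp.
conn.execute("CREATE INDEX idx_checkin_user_ts ON checkin (user_id, ts)")
conn.executemany(
    "INSERT INTO checkin VALUES (?, ?)",
    [
        # Hypothetical sample data, not from the question.
        (1, "2020-01-01 21:43:12"),
        (1, "2020-01-01 21:43:19"),
        (1, "2020-01-01 22:00:00"),
        (2, "2020-01-01 21:00:00"),
        (2, "2020-01-01 21:00:03"),
        (2, "2020-01-01 21:30:00"),
    ],
)

closest = conn.execute(CLOSEST_SQL).fetchone()
print(closest)  # (2, 3): user 2 has two check-ins 3 seconds apart
```

User 1's smallest gap is 7 seconds and user 2's is 3 seconds, so user 2 comes first.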
I figure you only want to compute the difference between two check-ins if they are two consecutive check-ins of the same person.
create table test (
id int,
person_id int,
checkin datetime);
insert into test (id, person_id, checkin) values (1, 1, now());
insert into test (id, person_id, checkin) values (2, 1, now());
insert into test (id, person_id, checkin) values (3, 2, now());
insert into test (id, person_id, checkin) values (4, 2, now());
insert into test (id, person_id, checkin) values (5, 1, now());
insert into test (id, person_id, checkin) values (6, 2, now());
insert into test (id, person_id, checkin) values (7, 1, now());
select * from (
select a.*,
(select timestampdiff(second, b.checkin, a.checkin)
from test b where b.person_id = a.person_id
and b.checkin < a.checkin
order by b.checkin desc
limit 1
) diff
from test a
where a.person_id = 1
order by a.person_id, a.checkin
) tt
where diff is not null
order by diff asc;
SELECT a.*, b.*
FROM table_name AS a
JOIN table_name AS b
ON a.id != b.id
ORDER BY ABS(TIMESTAMPDIFF(SECOND, a.checkin, b.checkin)) ASC
LIMIT 1
Should do it. Might be a bit laggy as mentioned.

How to select 5 distinct rows

How do I select 5 rows, one for each site_id? This query throws an error:
SELECT DISTINCT site_id, *
FROM deal
WHERE site_id IN (2, 3, 4, 5, 6)
ORDER BY id
DESC LIMIT 5
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '* FROM deal WHERE site_id IN (2, 3, 4, 5, 6) ORDER BY id DESC LIMIT 5' at line 1
Try using GROUP BY
SELECT *
FROM deal
WHERE site_id IN (2, 3, 4, 5, 6)
GROUP BY site_id
ORDER BY id DESC
LIMIT 5;
First get one particular deal id (the maximum one) for each site. (The inner query.)
Then get the full row for each of those deal ids. (The outer query.)
SELECT * FROM deal
WHERE id in (
SELECT max(id) maxid
FROM deal
WHERE site_id IN (2, 3, 4, 5, 6)
GROUP BY site_id
)
You can remove the following line if you want one row for every site in the database:
WHERE site_id IN (2, 3, 4, 5, 6)
If your table allows duplicate site_ids, and you only need to show one per site_id, then assuming ID is unique
SELECT * FROM deal
WHERE id in (
SELECT max(id) maxid
FROM deal
WHERE site_id IN (2, 3, 4, 5, 6)
GROUP by site_id
)
You can't specify DISTINCT site_id and also include the * wildcard. If you want to specify the distinct site_id, then you need to remove the wildcard and specify the other fields you want to use.
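The max(id)-per-site approach from the answers above can be sanity-checked with a few rows. This is a sketch in Python using an in-memory SQLite database as a stand-in for MySQL; the deal table is reduced to three columns and the rows are invented, since the question does not show its schema or data.

```python
import sqlite3

LATEST_DEAL_SQL = """
SELECT id, site_id, title
FROM deal
WHERE id IN (
    SELECT MAX(id)                      -- newest deal id per site
    FROM deal
    WHERE site_id IN (2, 3, 4, 5, 6)
    GROUP BY site_id
)
ORDER BY id DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deal (id INTEGER PRIMARY KEY, site_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO deal VALUES (?, ?, ?)",
    [
        # Hypothetical rows: sites 2 and 4 have two deals each, site 7 is filtered out.
        (1, 2, "a"), (2, 2, "b"), (3, 3, "c"),
        (4, 4, "d"), (5, 4, "e"), (6, 7, "f"),
    ],
)

latest = conn.execute(LATEST_DEAL_SQL).fetchall()
print(latest)  # [(5, 4, 'e'), (3, 3, 'c'), (2, 2, 'b')]
```

Each listed site that has at least one deal contributes exactly one row, so no LIMIT is needed when the IN list already contains five sites.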