MySQL efficient join with "in" without duplicates - mysql

I have an event table
+----------+------------+
| event_id | event_name |
+----------+------------+
| 1 | event1 |
| 2 | event2 |
| 3 | event3 |
| 4 | event4 |
+----------+------------+
And an event_performer table
+--------------------+----------+--------------+
| event_performer_id | event_id | performer_id |
+--------------------+----------+--------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 2 | 2 |
| 5 | 3 | 3 |
| 6 | 3 | 4 |
| 7 | 4 | 3 |
| 8 | 4 | 4 |
+--------------------+----------+--------------+
I want to get all the events with performer ids 1 and 2, so I run the following query:
select event.* from event
join event_performer
on event.event_id = event_performer.event_id
and performer_id in (1,2)
order by event_name
When I do that, I obviously get duplicate events (two for event_id 1 and two for event_id 2). What's the most efficient way in MySQL to remove the duplicates so that I get each event record once?
One idea is to use select distinct event.*, but how efficient is that for a large number of fields and records?
Note that the example tables are oversimplified. Each table has MANY more fields and MANY more records.

You can try using GROUP BY:
select event.event_id, event.event_name from event
join event_performer
on event.event_id = event_performer.event_id
and performer_id in (1,2)
GROUP BY event_name, event.event_id
order by event_name
Note, however, that if the grouped columns are covered by an index, GROUP BY performs about the same as DISTINCT. If there is no such index, I would recommend DISTINCT, as it is usually faster than GROUP BY. Either way, run EXPLAIN on both variants against your real data before deciding.
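As a quick way to compare the options, here is a minimal sketch that runs the DISTINCT and GROUP BY variants next to a third one based on EXISTS, which never produces the duplicates in the first place. The EXISTS variant is not from the answer above; it is an alternative added for comparison. The sketch uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the SQL is kept to syntax that both engines accept. It checks results only and says nothing about speed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (event_id INTEGER PRIMARY KEY, event_name TEXT);
    CREATE TABLE event_performer (
        event_performer_id INTEGER PRIMARY KEY,
        event_id INTEGER,
        performer_id INTEGER
    );
    INSERT INTO event VALUES
        (1, 'event1'), (2, 'event2'), (3, 'event3'), (4, 'event4');
    INSERT INTO event_performer VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2),
        (5, 3, 3), (6, 3, 4), (7, 4, 3), (8, 4, 4);
""")

# Variant 1: join, then remove duplicate rows with DISTINCT.
distinct_sql = """
    SELECT DISTINCT event.* FROM event
    JOIN event_performer
      ON event.event_id = event_performer.event_id
     AND performer_id IN (1, 2)
    ORDER BY event_name
"""

# Variant 2: join, then collapse duplicates with GROUP BY.
group_by_sql = """
    SELECT event.event_id, event.event_name FROM event
    JOIN event_performer
      ON event.event_id = event_performer.event_id
     AND performer_id IN (1, 2)
    GROUP BY event_name, event.event_id
    ORDER BY event_name
"""

# Variant 3: no join at all, so each event can appear at most once.
exists_sql = """
    SELECT event.* FROM event
    WHERE EXISTS (
        SELECT 1 FROM event_performer
        WHERE event_performer.event_id = event.event_id
          AND event_performer.performer_id IN (1, 2)
    )
    ORDER BY event_name
"""

distinct_rows = conn.execute(distinct_sql).fetchall()
group_by_rows = conn.execute(group_by_sql).fetchall()
exists_rows = conn.execute(exists_sql).fetchall()
print(distinct_rows, group_by_rows, exists_rows)
```

All three variants return the same two events on the sample data. Which one is fastest on a large table depends on the indexes and the MySQL version, so it is worth checking with EXPLAIN.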

Related

MySQL How to Select smth by MAX(id)....WHERE userID = some number GROUP BY smth

I have the following table in my DB:
personal_prizes
___________ ___________ _________ __________
| id | userId | specId| grp |
|___________|___________|_________|__________|
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 1 |
| 3 | 2 | 3 | 1 |
| 4 | 2 | 4 | 2 |
| 5 | 1 | 5 | 2 |
| 6 | 1 | 6 | 2 |
| 7 | 2 | 7 | 3 |
| 8 | 1 | 13 | 4 |
|___________|___________|_________|__________|
I need to select the specId for the max id in each grp.
So I have written the following query:
SELECT pp.specId
FROM personal_prizes pp
WHERE pp.specId IN (SELECT MAX(pp1.id)
FROM personal_prizes pp1
WHERE pp1.userId = 1
GROUP BY pp1.grp)
It works for my little table, but it is too slow when I run it against my prod db, where personal_prizes has more than 100,000 rows.
Please help me optimize it.
The query you have should work fine. Make sure though that you not only have an index on id (which I suppose is the primary key), but also one on specId.
Just as an alternative, you might try this one:
select group_concat(pp.specId order by pp1.id desc)+0 as result_specId
from personal_prizes pp1
left join personal_prizes pp on pp.specId = pp1.id
where pp1.userId = 1
group by pp1.grp
having result_specId is not null;
The idea here is that the sub query is promoted to the main query, and the specId is retrieved by an outer join. The group_concat aggregation function will list the one of interest as the first. The having clause will exclude the cases where no matching specId was found.
Note that this will only give the same results if the specId field is guaranteed to be non-null.
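To check a rewrite against the original on the sample data, here is a minimal sketch. It compares the question's IN query with a plain join against a derived table, which is a different rewrite from the group_concat one above and is added only for comparison. SQLite (through Python's sqlite3 module) stands in for MySQL, and both index names as well as the composite index itself are hypothetical suggestions, not something from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE personal_prizes (
        id INTEGER PRIMARY KEY,
        userId INTEGER,
        specId INTEGER,
        grp INTEGER
    );
    INSERT INTO personal_prizes VALUES
        (1, 1, 1, 1), (2, 1, 2, 1), (3, 2, 3, 1), (4, 2, 4, 2),
        (5, 1, 5, 2), (6, 1, 6, 2), (7, 2, 7, 3), (8, 1, 13, 4);
    -- Index on specId, as suggested in the answer (the name is made up).
    CREATE INDEX idx_pp_specid ON personal_prizes (specId);
    -- Hypothetical composite index so the inner GROUP BY can be resolved
    -- from the index alone.
    CREATE INDEX idx_pp_user_grp_id ON personal_prizes (userId, grp, id);
""")

# The query from the question, unchanged.
in_sql = """
    SELECT pp.specId
    FROM personal_prizes pp
    WHERE pp.specId IN (SELECT MAX(pp1.id)
                        FROM personal_prizes pp1
                        WHERE pp1.userId = 1
                        GROUP BY pp1.grp)
"""

# Same logic as a join against a derived table of per-group maximums.
join_sql = """
    SELECT pp.specId
    FROM personal_prizes pp
    JOIN (SELECT MAX(pp1.id) AS max_id
          FROM personal_prizes pp1
          WHERE pp1.userId = 1
          GROUP BY pp1.grp) m
      ON pp.specId = m.max_id
"""

in_rows = sorted(row[0] for row in conn.execute(in_sql))
join_rows = sorted(row[0] for row in conn.execute(join_sql))
print(in_rows, join_rows)
```

For userId 1 the per-group maximum ids are 2, 6 and 8. Only 2 and 6 occur as a specId in the sample table, so both queries return those two values.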

MySQL Subselect Issue with two tables and aggregate functions into single query

I have 2 tables
Transaction table
+-----+------------+------------+
| TID | CampaignID | DATE       |
+-----+------------+------------+
| 1   | 5          | 2016-01-01 |
| 2   | 5          | 2016-01-01 |
| 3   | 2          | 2016-01-01 |
| 4   | 5          | 2016-01-01 |
| 5   | 1          | 2016-01-01 |
| 6   | 1          | 2016-02-02 |
| 7   | 3          | 2016-02-02 |
| 8   | 3          | 2016-02-02 |
| 9   | 5          | 2016-02-02 |
| 10  | 4          | 2016-02-02 |
+-----+------------+------------+
Campaign Table
+------------+---------------------+----------------+
| CampaignID | DailyMaxImpressions | CampaignActive |
+------------+---------------------+----------------+
| 1          | 5                   | Y              |
| 2          | 5                   | Y              |
| 3          | 5                   | Y              |
| 4          | 5                   | Y              |
| 5          | 1                   | Y              |
+------------+---------------------+----------------+
What I am trying to do is get a single random campaign where the count in the transaction table is less than the daily max impressions in the campaign table. I might also be passing a date as part of the query for the transaction table.
So for CampaignID 1 there must be 4 transactions or fewer in the transaction table, and CampaignActive must be "Y".
Any help would be appreciated if this can be done in a single statement. ( mysql )
Thanks in advance,
Jeff Godstein
This should get it for you. The basic query selects each campaign that is active. The inner query pre-aggregates the transaction count per campaign for the date in question. The LEFT JOIN then lets a campaign be returned either when it does NOT exist in the subquery (no transactions that day) or when it DOES exist but its count is less than the maximum allowed. ORDER BY RAND() shuffles the qualifying campaigns, and LIMIT 1 keeps a single one.
SELECT
    c.CampaignID
from
    Campaign c
    LEFT JOIN
    ( select
          t1.CampaignID,
          count(*) as CampCount
      from
          Transaction t1
      where
          t1.Date = YourDateParameterValue
      group by
          t1.CampaignID ) as t
    ON c.CampaignID = t.CampaignID
where
    c.CampaignActive = 'Y'
    AND ( t.CampaignID IS NULL
       OR t.CampCount < c.DailyMaxImpressions )
order by
    RAND()
limit 1
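To see which campaigns qualify on the sample data, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. Two things differ from MySQL and are assumptions of the sketch only: the table name is quoted because TRANSACTION is a keyword in SQLite, and SQLite spells the random function RANDOM() instead of RAND().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Transaction" (
        TID INTEGER PRIMARY KEY,
        CampaignID INTEGER,
        Date TEXT
    );
    CREATE TABLE Campaign (
        CampaignID INTEGER PRIMARY KEY,
        DailyMaxImpressions INTEGER,
        CampaignActive TEXT
    );
    INSERT INTO "Transaction" VALUES
        (1, 5, '2016-01-01'), (2, 5, '2016-01-01'), (3, 2, '2016-01-01'),
        (4, 5, '2016-01-01'), (5, 1, '2016-01-01'), (6, 1, '2016-02-02'),
        (7, 3, '2016-02-02'), (8, 3, '2016-02-02'), (9, 5, '2016-02-02'),
        (10, 4, '2016-02-02');
    INSERT INTO Campaign VALUES
        (1, 5, 'Y'), (2, 5, 'Y'), (3, 5, 'Y'), (4, 5, 'Y'), (5, 1, 'Y');
""")

# Active campaigns whose transaction count for the given day is below
# their daily maximum, including campaigns with no transactions that day.
eligible_sql = """
    SELECT c.CampaignID
    FROM Campaign c
    LEFT JOIN (SELECT t1.CampaignID, COUNT(*) AS CampCount
               FROM "Transaction" t1
               WHERE t1.Date = ?
               GROUP BY t1.CampaignID) AS t
      ON c.CampaignID = t.CampaignID
    WHERE c.CampaignActive = 'Y'
      AND (t.CampaignID IS NULL
           OR t.CampCount < c.DailyMaxImpressions)
"""

day = "2016-01-01"
eligible = sorted(row[0] for row in conn.execute(eligible_sql, (day,)))

# One random pick; RANDOM() is the SQLite spelling of MySQL's RAND().
random_pick = conn.execute(
    eligible_sql + " ORDER BY RANDOM() LIMIT 1", (day,)
).fetchone()[0]
print(eligible, random_pick)
```

On 2016-01-01 campaign 5 already has three transactions against a maximum of 1, so it is excluded, while campaigns 3 and 4 qualify through the IS NULL branch because they have no transactions that day.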

MySQL query SELECT FROM 2 tables, COUNT the most used

I have these 2 tables and I need to return the most used office. Note: one office can be used by more than one guy, and the column ido in TableB is populated from TableA.
It is probably a query with group by and desc limit 1.
TableA
| ido| office | guy |
---------------------
| 1 | office1| guy1|
| 2 | office2| guy2|
| 3 | office1| guy3|
| 4 | office1| guy4|
| 5 | office5| guy5|
| 6 | office2| guy6|
TableB
| idb| vizit | ido|
---------------------
| 1 | date | 4 |
| 2 | date | 2 |
| 3 | date | 5 |
| 4 | date | 6 |
| 5 | date | 1 |
| 6 | date | 6 |
Thanks!
You were correct that GROUP BY, LIMIT and DESC are useful here; they lead to a fairly straightforward query:
SELECT TableA.office
FROM TableA
JOIN TableB
ON TableA.ido = TableB.ido
GROUP BY TableA.office
ORDER BY COUNT(*) DESC
LIMIT 1
What it does is basically create rows with all valid combinations, counting the number of generated rows per office. A plain descending sort by that count will give you the most frequently used office.
An SQLfiddle to test with.
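For a local check without a fiddle, here is a minimal sketch that loads the sample rows and runs the query from the answer. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made to keep the example self-contained), and the vizit column is filled with the literal placeholder 'date' exactly as in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ido INTEGER PRIMARY KEY, office TEXT, guy TEXT);
    CREATE TABLE TableB (idb INTEGER PRIMARY KEY, vizit TEXT, ido INTEGER);
    INSERT INTO TableA VALUES
        (1, 'office1', 'guy1'), (2, 'office2', 'guy2'),
        (3, 'office1', 'guy3'), (4, 'office1', 'guy4'),
        (5, 'office5', 'guy5'), (6, 'office2', 'guy6');
    INSERT INTO TableB VALUES
        (1, 'date', 4), (2, 'date', 2), (3, 'date', 5),
        (4, 'date', 6), (5, 'date', 1), (6, 'date', 6);
""")

# Visit count per office, most visited first.
counts = conn.execute("""
    SELECT TableA.office, COUNT(*) AS visits
    FROM TableA
    JOIN TableB ON TableA.ido = TableB.ido
    GROUP BY TableA.office
    ORDER BY COUNT(*) DESC
""").fetchall()

# The query from the answer: only the top office.
most_used = conn.execute("""
    SELECT TableA.office
    FROM TableA
    JOIN TableB ON TableA.ido = TableB.ido
    GROUP BY TableA.office
    ORDER BY COUNT(*) DESC
    LIMIT 1
""").fetchone()[0]
print(counts, most_used)
```

Note that with LIMIT 1 a tie between two offices would be broken arbitrarily; the sample data has no tie.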

Select Distinct Set Common to Subset From Join Table

Given a join table for a many-to-many relationship between booth and user
+-----------+------------------+
| booth_id | user_id |
+-----------+------------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 5 |
| 1 | 9 |
| 2 | 1 |
| 2 | 2 |
| 2 | 5 |
| 2 | 10 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 3 | 6 |
| 3 | 11 |
+-----------+------------------+
How can I get a distinct set of booth records that are common between a subset of user ids? For example, if I am given user_id values of 1,2,3, I expect the result set to include only booth with id 3 since it is the only common booth in the join table above between all user_id's provided.
I'm hoping I'm missing a keyword in MySQL to accomplish this. The furthest I've come so far is using ... user_id = all (1,2,3), but this always returns an empty result set (I believe I understand why, though).
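The SQL query for this will be:
select booth_id
from table1
where user_id in (1,2,3)
group by booth_id
having count(booth_id) =
    (select count(distinct user_id) from table1 where user_id in (1,2,3))
The having clause keeps only the booths whose number of matching rows equals the number of requested users, so it assumes each (booth_id, user_id) pair is stored only once. I hope this helps you create the MySQL query.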
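Here is a minimal sketch of the same idea as a small helper, run against the sample rows. It is a variant rather than the exact query above: it compares COUNT(DISTINCT user_id) with the number of ids passed in, which avoids the second subquery. The helper name common_booths is hypothetical, the table name table1 is taken from the answer, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (booth_id INTEGER, user_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 1), (1, 2), (1, 5), (1, 9),
        (2, 1), (2, 2), (2, 5), (2, 10),
        (3, 1), (3, 2), (3, 3), (3, 4), (3, 6), (3, 11);
""")


def common_booths(conn, user_ids):
    """Return booth ids that are linked to every user id in user_ids."""
    wanted = sorted(set(user_ids))
    placeholders = ",".join("?" * len(wanted))  # e.g. "?,?,?"
    sql = f"""
        SELECT booth_id
        FROM table1
        WHERE user_id IN ({placeholders})
        GROUP BY booth_id
        HAVING COUNT(DISTINCT user_id) = ?
        ORDER BY booth_id
    """
    # The last parameter is the number of distinct users requested.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


booths_123 = common_booths(conn, [1, 2, 3])
booths_125 = common_booths(conn, [1, 2, 5])
print(booths_123, booths_125)
```

Only booth 3 has all of users 1, 2 and 3, while users 1, 2 and 5 share booths 1 and 2.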

SQL select all records from one table which appear < 3 times in another

I have got two tables:
Event
+----------+---------+------------+
| event_id | name    | date       |
+----------+---------+------------+
| 1        | Event 1 | 26/03/2012 |
| 2        | Event 2 | 27/03/2012 |
+----------+---------+------------+
Reservation
+----------------+------------+-----------+
| reservation_id | date       | themed_id |
+----------------+------------+-----------+
| 1              | 26/03/2012 | 1         |
| 2              | 26/03/2012 | 1         |
| 3              | 27/03/2012 | 2         |
| 4              | 26/03/2012 | 1         |
+----------------+------------+-----------+
How can I display all the events which appear fewer than 3 times in the reservation table?
The output will be:
+----------+---------+------------+
| event_id | name    | date       |
+----------+---------+------------+
| 2        | Event 2 | 27/03/2012 |
+----------+---------+------------+
As event two has only appeared once in reservation
thanks
SELECT *
FROM Event
WHERE event_id IN (
SELECT themed_id
FROM Reservation
GROUP BY themed_id
HAVING COUNT(*) < 3)
I have not tested it, but this is the basic idea.
I'm guessing Event.event_id = Reservation.themed_id? If so:
Edit: Changed to a LEFT JOIN to include events with 0 reservations.
SELECT
Event.event_id,
Event.name,
Event.date
FROM
Event
LEFT JOIN Reservation ON Event.event_id = Reservation.themed_id
GROUP BY
Event.event_id
HAVING
COUNT(DISTINCT Reservation.reservation_id) < 3
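To make the difference between the two answers concrete, here is a minimal sketch that runs both queries side by side. It uses Python's sqlite3 module as a stand-in for MySQL, and it adds one hypothetical row, Event 3 with no reservations at all, which is not in the question's data. That extra row is there only to show why the LEFT JOIN matters.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event (event_id INTEGER PRIMARY KEY, name TEXT, date TEXT);
    CREATE TABLE Reservation (
        reservation_id INTEGER PRIMARY KEY,
        date TEXT,
        themed_id INTEGER
    );
    INSERT INTO Event VALUES
        (1, 'Event 1', '26/03/2012'),
        (2, 'Event 2', '27/03/2012'),
        (3, 'Event 3', '28/03/2012');  -- hypothetical: has no reservations
    INSERT INTO Reservation VALUES
        (1, '26/03/2012', 1), (2, '26/03/2012', 1),
        (3, '27/03/2012', 2), (4, '26/03/2012', 1);
""")

# First answer: the subquery only sees events with at least one reservation.
in_sql = """
    SELECT event_id
    FROM Event
    WHERE event_id IN (SELECT themed_id
                       FROM Reservation
                       GROUP BY themed_id
                       HAVING COUNT(*) < 3)
    ORDER BY event_id
"""

# Second answer: the LEFT JOIN keeps events with zero reservations too.
left_join_sql = """
    SELECT Event.event_id
    FROM Event
    LEFT JOIN Reservation ON Event.event_id = Reservation.themed_id
    GROUP BY Event.event_id
    HAVING COUNT(DISTINCT Reservation.reservation_id) < 3
    ORDER BY Event.event_id
"""

in_ids = [row[0] for row in conn.execute(in_sql)]
left_join_ids = [row[0] for row in conn.execute(left_join_sql)]
print(in_ids, left_join_ids)
```

Event 1 has three reservations and is excluded by both queries. Event 2 has one and is returned by both. The unreserved Event 3 is returned only by the LEFT JOIN version.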