I have a list of TV shows. Each TV show may be blacked out in 0 or more timezones. To say that a show is "blacked out" in a timezone means that the network does not have rights to air the show in that timezone. This data looks like this:
|----|---------------------|
| ID | Show |
|----|---------------------|
| 1 | Nightly News |
| 2 | Primetime Sitcom |
| 3 | Daytime Talkshow |
| 4 | Nightly News II |
| 5 | Daytime Talkshow II |
| 6 | Nightly News III |
|----|---------------------|
|
|-----join
|
v
|----|----------------------|
| ID | Timezone Restriction |
|----|----------------------|
| 1 | EST |
| 1 | CST |
| 1 | PST |
| 2 | EST |
| 2 | CST |
| 3 | PST |
| 5 | CST |
| 5 | PST |
| 6 | HST |
|----|----------------------|
Not all shows are timezone restricted (most are not). Given this data, I need to fetch a list that contains as many results as necessary to supply 2 shows in each timezone that are not blacked out. The results should be ordered by ID, with each timezone seeing the lowest possible unrestricted IDs.
For instance, in the above dataset, this hypothetical query would return rows 1-4, e.g.:
|----|------------------|--------------|
| ID | Show | Restrictions |
|----|------------------|--------------|
| 1 | Nightly News | EST,CST,PST |
| 2 | Primetime Sitcom | EST,CST |
| 3 | Daytime Talkshow | PST |
| 4 | Nightly News II | None |
|----|------------------|--------------|
As you can see, in the above result set, all timezones have at least 2 shows which are unrestricted. A viewer in EST or CST could watch programs 3 and 4. A viewer in PST could view programs 2 and 4. A viewer in MST or HST could view programs 1 and 2.
I can't for the life of me figure out the SQL that would get at this problem (sidenote, I don't actually need the "restrictions" column in my result, that's just here for explanatory purposes).
Create a table that lists all the timezones. You can then CROSS JOIN this with the show list, to get all potential zones where a show could be shown. Then use a LEFT JOIN with the restrictions table to filter out the rows that match any restrictions, as described in Return row only if value doesn't exist.
SELECT s.show, z.zone
FROM shows AS s
CROSS JOIN timezones AS z
LEFT JOIN restrictions AS r ON r.id = s.id AND r.`Timezone Restriction` = z.zone
WHERE r.id IS NULL
ORDER BY z.zone, s.id
DEMO
This lists all the shows that can be shown in each timezone, not just the first 2. See Using LIMIT within GROUP BY to get N results per group? for how to restrict the number of results per group.
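As a rough sketch of how the two steps combine (anti-join, then keep the first 2 per zone), here is a small Python script that runs the idea against an in-memory SQLite database. SQLite is only a stand-in for MySQL here, and the `timezones` table, the `zone` column name (used in place of `Timezone Restriction`), and the correlated-count trick for "first 2 per group" are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows (id INTEGER PRIMARY KEY, show TEXT);
    CREATE TABLE timezones (zone TEXT PRIMARY KEY);     -- hypothetical lookup table
    CREATE TABLE restrictions (id INTEGER, zone TEXT);  -- zone = "Timezone Restriction"
    INSERT INTO shows VALUES
        (1, 'Nightly News'), (2, 'Primetime Sitcom'), (3, 'Daytime Talkshow'),
        (4, 'Nightly News II'), (5, 'Daytime Talkshow II'), (6, 'Nightly News III');
    INSERT INTO timezones VALUES ('EST'), ('CST'), ('MST'), ('PST'), ('HST');
    INSERT INTO restrictions VALUES
        (1, 'EST'), (1, 'CST'), (1, 'PST'), (2, 'EST'), (2, 'CST'),
        (3, 'PST'), (5, 'CST'), (5, 'PST'), (6, 'HST');
""")

query = """
WITH allowed AS (
    -- every (show, zone) pair that has no matching restriction row
    SELECT s.id, z.zone
    FROM shows AS s
    CROSS JOIN timezones AS z
    LEFT JOIN restrictions AS r ON r.id = s.id AND r.zone = z.zone
    WHERE r.id IS NULL
)
SELECT DISTINCT a.id
FROM allowed AS a
-- keep a show only if fewer than 2 allowed shows in the same zone have a lower id
WHERE (SELECT COUNT(*) FROM allowed AS b
       WHERE b.zone = a.zone AND b.id < a.id) < 2
ORDER BY a.id
"""

ids = [row[0] for row in conn.execute(query)]
print(ids)
```

For the sample data this yields IDs 1 through 4, matching the result set described in the question. The `WITH` clause needs MySQL 8.0 or later; on older versions the `allowed` subquery would have to be repeated inline.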
So having thought about this a bit more, I'm pretty sure the thing I want to do is 1) lookup a list of unrestricted shows for each timezone and 2) UNION them all together. This actually seems like pretty much exactly the usecase UNION was created for now that I think of it.
So I can get a single timezone's unrestricted shows like so:
SELECT `shows`.`ID`
FROM shows
WHERE `shows`.`ID` NOT IN (
SELECT `restrictions`.`ID`
FROM restrictions
WHERE `Timezone Restriction`='EST'
)
ORDER BY `shows`.`ID`
LIMIT 2
And then just chain them together like so:
(SELECT `shows`.`ID` FROM shows WHERE `shows`.`ID` NOT IN (SELECT `restrictions`.`ID` FROM restrictions WHERE `Timezone Restriction`='EST') ORDER BY `shows`.`ID` LIMIT 2)
UNION
(SELECT `shows`.`ID` FROM shows WHERE `shows`.`ID` NOT IN (SELECT `restrictions`.`ID` FROM restrictions WHERE `Timezone Restriction`='CST') ORDER BY `shows`.`ID` LIMIT 2)
UNION
(SELECT `shows`.`ID` FROM shows WHERE `shows`.`ID` NOT IN (SELECT `restrictions`.`ID` FROM restrictions WHERE `Timezone Restriction`='MST') ORDER BY `shows`.`ID` LIMIT 2)
UNION
(SELECT `shows`.`ID` FROM shows WHERE `shows`.`ID` NOT IN (SELECT `restrictions`.`ID` FROM restrictions WHERE `Timezone Restriction`='PST') ORDER BY `shows`.`ID` LIMIT 2)
UNION
(SELECT `shows`.`ID` FROM shows WHERE `shows`.`ID` NOT IN (SELECT `restrictions`.`ID` FROM restrictions WHERE `Timezone Restriction`='HST') ORDER BY `shows`.`ID` LIMIT 2)
ORDER BY ID;
Building on top of the sqlfiddle @Barmar supplied: http://www.sqlfiddle.com/#!9/25773/1/0
So I have a table called the Activities table that contains a schema of user_id, activity
There is a row for each user, activity combo.
Here is what it might look like (empty rows added to make things easier to look at, please ignore):
| user_id | activity |
|---------|-----------|
| 1 | swimming | -- We want to match this
| 1 | running | -- person's activities
| | |
| 2 | swimming |
| 2 | running |
| 2 | rowing |
| | |
| 3 | swimming |
| | |
| 4 | skydiving |
| 4 | running |
| 4 | swimming |
I would like to basically find all other users with at least the same activities as a given input id so that I could recommend users with similar activities.
So in the table above, if I want to find recommended users for user_id=1, the query would return user_id=2 and user_id=4 because they engage in both swimming and running (and more), but not user_id=3 because they only engage in swimming.
So a result with a single column of:
| user_id |
|---------|
| 2 |
| 4 |
is what I would ideally be looking for
As far as what I've tried, I am kinda stuck at how to get a solid set of user_id=1's activities to match against. Basically I'm looking for something along the lines of:
SELECT user_id from Activities
GROUP BY user_id
HAVING input_user_activities in user_x_activities
where input_user_activities is just a set of our input user's activities. I can create that set using a WITH input_user_activities AS (...) at the beginning; what I'm stuck on is the user_x_activities part.
Any thoughts?
To get users with the same activities, you can use a self join. Let me assume that the rows are unique:
select a.user_id
from activities a1 join
activities a
on a1.activity = a.activity and
a1.user_id = @user_id
group by a.user_id
having count(*) = (select count(*) from activities a1 where a1.user_id = @user_id);
The having clause answers your question -- of getting users that have the same activities as a given user.
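To make the counting argument concrete, here is a small sketch that runs the same self join against the sample table using Python's built-in `sqlite3` module. SQLite stands in for the asker's database, the helper name `users_covering` is hypothetical, and the extra `a.user_id <> :uid` condition is an addition that keeps the input user out of their own result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- rows are assumed unique per (user_id, activity), as the answer states
    CREATE TABLE activities (user_id INTEGER, activity TEXT,
                             UNIQUE (user_id, activity));
    INSERT INTO activities VALUES
        (1, 'swimming'), (1, 'running'),
        (2, 'swimming'), (2, 'running'), (2, 'rowing'),
        (3, 'swimming'),
        (4, 'skydiving'), (4, 'running'), (4, 'swimming');
""")

def users_covering(conn, user_id):
    """Return ids of other users who have at least all of user_id's activities."""
    query = """
        SELECT a.user_id
        FROM activities AS a1
        JOIN activities AS a ON a1.activity = a.activity
        WHERE a1.user_id = :uid AND a.user_id <> :uid
        GROUP BY a.user_id
        -- a candidate qualifies when it matched every activity of the input user
        HAVING COUNT(*) = (SELECT COUNT(*) FROM activities WHERE user_id = :uid)
        ORDER BY a.user_id
    """
    return [row[0] for row in conn.execute(query, {"uid": user_id})]

matches = users_covering(conn, 1)
print(matches)
```

With the sample rows, user 1 is matched by users 2 and 4; user 3 joins on only one of the two activities, so the `HAVING` comparison drops them.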
You can easily get all users ordered by similarity using a JOIN (that finds all common rows) and a GROUP BY (to summarize the similarity per user_id) and finally an ORDER BY to return the most similar users first.
SELECT b.user_id, COUNT(*) similarity
FROM activities a
JOIN activities b
ON a.activity = b.activity
WHERE a.user_id = 1 AND b.user_id != 1
GROUP BY b.user_id
ORDER BY COUNT(*) DESC
An SQLfiddle to test with.
I have the following (simplified) three tables:
user_reservations:
id | user_id |
1 | 3 |
1 | 3 |
user_kar:
id | user_id | szak_id |
1 | 3 | 1 |
2 | 3 | 2 |
szak:
id | name |
1 | A |
2 | B |
Now I would like to count the reservations of the user by the 'szak' name, but I want to have every user counted only for one szak. In this case, user_id 3 has 2 'szak', and if I write a query something like:
SELECT sz.name, COUNT(*) FROM user_reservations r
LEFT JOIN user_kar k ON k.user_id = r.user_id
LEFT JOIN szak sz ON sz.id = k.szak_id
GROUP BY sz.name
It will return two rows:
A | 2 |
B | 2 |
However, I want every reservation to be counted for only one szak (let's say the one with the highest id). I tried MAX(k.id) with HAVING, but it had no effect.
I would like to know if there is a supported method for that in MySQL, or should I first pick all the user IDs on the backend side, check their maximum kar.user_id, and then count only with those, removing them from the id list when the given szak is counted, and then build the data back together on the backend side?
Thanks for the help - I was googling around for like 2 hours, but so far, I found no solution, so maybe you could help me.
Something like this?
SELECT sz.name,
Count(*)
FROM (SELECT r.user_id,
Ifnull(Max(k.szak_id), -1) AS max_szak_id
FROM user_reservations r
LEFT OUTER JOIN user_kar k
ON k.user_id = r.user_id
GROUP BY r.user_id) t
LEFT OUTER JOIN szak sz
ON sz.id = t.max_szak_id
GROUP BY sz.name;
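Note that the query above groups the inner result by user, so it counts one row per user. If the goal is instead to count every reservation under the user's highest szak_id, one possible variant is to compute the per-user maximum from user_kar alone and join the reservations to it. The sketch below tries this on the sample rows using Python's `sqlite3` module as a stand-in for MySQL; the reading of the requirement is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_reservations (id INTEGER, user_id INTEGER);
    CREATE TABLE user_kar (id INTEGER, user_id INTEGER, szak_id INTEGER);
    CREATE TABLE szak (id INTEGER, name TEXT);
    -- sample rows copied from the question
    INSERT INTO user_reservations VALUES (1, 3), (1, 3);
    INSERT INTO user_kar VALUES (1, 3, 1), (2, 3, 2);
    INSERT INTO szak VALUES (1, 'A'), (2, 'B');
""")

query = """
    SELECT sz.name, COUNT(*) AS reservations
    FROM user_reservations AS r
    LEFT JOIN (
        -- one row per user: the highest szak_id that user belongs to
        SELECT user_id, MAX(szak_id) AS max_szak_id
        FROM user_kar
        GROUP BY user_id
    ) AS k ON k.user_id = r.user_id
    LEFT JOIN szak AS sz ON sz.id = k.max_szak_id
    GROUP BY sz.name
"""

counts = conn.execute(query).fetchall()
print(counts)
```

Because the derived table has exactly one row per user, each reservation joins to at most one szak, so both sample reservations are counted once under B.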
I have a 'users' table:
user_id | prov_platform | first_name | last_name
--------|-----------------|--------------|-------------------
1 | Facebook | Joe | Bloggs
2 | Facebook | Sue | Barker
3 | | John | Doe
4 | Twitter | John | Terry
5 | Google | Angelina | Jolie
And I originally wanted to return a list of all the different social platform types there were in my users table, with counts beside each one - so I came up with this:
SELECT
IFNULL(prov_platform, 'Other') AS prov_platform,
COUNT(*) AS platform_total
FROM users
GROUP BY prov_platform
ORDER BY platform_total DESC
Which resulted in this:
prov_platform | platform_total
---------------|-----------------
Facebook | 2
Twitter | 1
Google | 1
Other | 1
But I now want to add another couple of fields to this query; 'allround_total' and 'percentage'. So, the above recordset would become:
prov_platform | platform_total | allround_total | percentage
---------------|----------------|----------------|---------------
Facebook | 2 | 5 | 40%
Twitter | 1 | 5 | 20%
Google | 1 | 5 | 20%
Other | 1 | 5 | 20%
This is as far as I got before getting in a muddle:
SELECT
u.prov_platform,
COUNT(*) AS platform_total,
allround_total,
allround_total/platform_total*100 AS percentage
FROM
users AS u
INNER JOIN (
SELECT COUNT(*) AS allround_total FROM users
) AS allround_total
GROUP BY
prov_platform
ORDER BY
platform_total DESC
This returns the 'allround_total' field, which works, but I have no idea how performance friendly it'll be. What I can't work out is how to get the percentage to work correctly. Currently, the above query returns an error:
Unknown column 'platform_total' in 'field list'
I think I'm close, I just need a much appreciated push over the line.
You cannot use column aliases in the same level as they are defined. I also think you have the calculation for percentage backwards.
SELECT u.prov_platform, COUNT(*) AS platform_total,
const.allround_total,
100*count(*)/const.allround_total AS percentage
FROM users u cross join
(SELECT COUNT(*) as allround_total FROM users
) const
GROUP BY prov_platform
ORDER BY platform_total DESC;
I changed the join from inner join to cross join. Although MySQL allows all joins to lack an on clause, I find it disconcerting to see an inner join with no on. Similarly, I changed the name of the table alias to differ from the column alias, to make the query easier to read.
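For a quick check of the arithmetic, the sketch below runs the cross-join approach on the sample users through Python's `sqlite3` module. SQLite is a stand-in for MySQL, so a few details are assumptions: `COALESCE` replaces `IFNULL`, the blank platform for user 3 is stored as NULL, the derived table is aliased `totals`, and `100.0` is used because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, prov_platform TEXT,
                        first_name TEXT, last_name TEXT);
    INSERT INTO users VALUES
        (1, 'Facebook', 'Joe', 'Bloggs'),
        (2, 'Facebook', 'Sue', 'Barker'),
        (3, NULL, 'John', 'Doe'),
        (4, 'Twitter', 'John', 'Terry'),
        (5, 'Google', 'Angelina', 'Jolie');
""")

query = """
    SELECT COALESCE(u.prov_platform, 'Other') AS platform,
           COUNT(*) AS platform_total,
           totals.allround_total,
           -- share of all users: group count divided by the overall count
           100.0 * COUNT(*) / totals.allround_total AS percentage
    FROM users AS u
    CROSS JOIN (SELECT COUNT(*) AS allround_total FROM users) AS totals
    GROUP BY COALESCE(u.prov_platform, 'Other'), totals.allround_total
    ORDER BY platform_total DESC, platform
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The secondary sort on `platform` is only there to make the order of the tied rows deterministic.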
I have a table from which I am trying to retrieve the latest position for each security:
The Table:
My query to create the table: SELECT id, security, buy_date FROM positions WHERE client_id = 4
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 26 | PCS | 2012-02-08 |
| 27 | PCS | 2013-01-19 |
| 28 | RDN | 2012-04-17 |
| 29 | RDN | 2012-05-19 |
| 30 | RDN | 2012-08-18 |
| 31 | RDN | 2012-09-19 |
| 32 | HK | 2012-09-25 |
| 33 | HK | 2012-11-13 |
| 34 | HK | 2013-01-19 |
| 35 | SGI | 2013-01-17 |
| 36 | SGI | 2013-02-16 |
| 18084 | KERX | 2013-02-20 |
| 18249 | KERX | 0000-00-00 |
+-------+----------+------------+
I have been messing with versions of queries based on this page, but I cannot seem to get the result I'm looking for.
Here is what I've been trying:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security)
But this just returns me:
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 27 | PCS | 2013-01-19 |
+-------+----------+------------+
I'm trying to get the maximum/latest buy date for each security, so the results would have one row for each security with the most recent buy date. Any help is greatly appreciated.
EDIT: The position's id must be returned with the max buy date.
You can use this query. When I checked it against a larger data set, it returned results in about 75% less time; the subquery versions take longer.
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
left join
positions p2
on p1.security = p2.security
and p1.buy_date < p2.buy_date
where
p2.id is null;
SQL-Fiddle link
You can use a subquery to get the result:
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
inner join
(
SELECT MAX(buy_date) MaxDate, security
FROM positions
group by security
) p2
on p1.buy_date = p2.MaxDate
and p1.security = p2.security
See SQL Fiddle with Demo
Or you can use the following with a WHERE clause:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security
group by t2.security)
See SQL Fiddle with Demo
This is done with a simple group by. You want to group by the securities and get the max of buy_date. The SQL:
SELECT security, max(buy_date)
from positions
group by security
Note, this is faster than bluefeet's answer but does not display the ID.
The answer by @bluefeet has two more ways to get the results you want - and the first will probably be more efficient than your query.
What I don't understand is why you say that your query doesn't work. It seems pretty fine and returns the expected result. Tested at SQL-Fiddle
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security ) ;
If the problem appears when you add the client_id = 4 condition, then it's because you add it only in one WHERE clause while you have to add it in both:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE client_id = 4
AND buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE client_id = 4
AND t1.security = t2.security ) ;
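To illustrate why the filter has to appear in both places, here is a small sketch using Python's `sqlite3` module with the rows from the question. SQLite stands in for MySQL, dates are stored as ISO text, and the extra row for client 5 is hypothetical, added only so that another client holds a later PCS position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE positions (
    id INTEGER PRIMARY KEY, client_id INTEGER, security TEXT, buy_date TEXT)""")
conn.executemany("INSERT INTO positions VALUES (?, ?, ?, ?)", [
    (26, 4, "PCS", "2012-02-08"), (27, 4, "PCS", "2013-01-19"),
    (28, 4, "RDN", "2012-04-17"), (29, 4, "RDN", "2012-05-19"),
    (30, 4, "RDN", "2012-08-18"), (31, 4, "RDN", "2012-09-19"),
    (32, 4, "HK", "2012-09-25"), (33, 4, "HK", "2012-11-13"),
    (34, 4, "HK", "2013-01-19"), (35, 4, "SGI", "2013-01-17"),
    (36, 4, "SGI", "2013-02-16"), (18084, 4, "KERX", "2013-02-20"),
    (18249, 4, "KERX", "0000-00-00"),
    (40, 5, "PCS", "2014-01-01"),  # hypothetical row owned by another client
])

# client_id filter in the outer query AND in the correlated subquery
filtered_both = """
    SELECT t1.id, t1.security, t1.buy_date
    FROM positions t1
    WHERE t1.client_id = 4
      AND t1.buy_date = (SELECT MAX(t2.buy_date) FROM positions t2
                         WHERE t2.client_id = 4 AND t2.security = t1.security)
    ORDER BY t1.id
"""

# client_id filter in the outer query only: the subquery sees every client
filtered_outer_only = """
    SELECT t1.id, t1.security, t1.buy_date
    FROM positions t1
    WHERE t1.client_id = 4
      AND t1.buy_date = (SELECT MAX(t2.buy_date) FROM positions t2
                         WHERE t2.security = t1.security)
    ORDER BY t1.id
"""

latest = conn.execute(filtered_both).fetchall()
incomplete = conn.execute(filtered_outer_only).fetchall()
print(latest)
print(incomplete)
```

With the filter in both places, each of the five securities comes back with its latest row for client 4. With the filter only in the outer query, PCS disappears, because the maximum date found by the subquery belongs to the other client's row.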
select security, max(buy_date) from positions group by security;
is all you need to get max buy date for each security (when you say out loud what you want from a query and you include the phrase "for each x", you probably want a group by on x)
When you use a group by, all columns in your select must either be columns that have been grouped by or aggregates, so if, for example, you wanted to include id, you'd probably have to use a subquery similar to what you had before, since there doesn't seem to be any aggregate you can reasonably use on the ids, and another group by would give you too many rows.
Here is what I'm trying to do. I have a table with user assessments which may contain duplicate rows. I'm looking to only get DISTINCT values for each user.
In the example table below, if only user_id 1 and 50 belong to the specific location, then only the unique video_ids for each user should be counted. User 1 passed videos 1, 2, and 1 again, so that should count as only 2 records, and user 50 passed video 2. So the total for this location would be 3. I think I need to have two DISTINCTs in the query, but am not sure how to do this.
+-----+----------+----------+
| id | video_id | user_id |
+-----+----------+----------+
| 1 | 1 | 1 |
| 2 | 2 | 50 |
| 3 | 1 | 115 |
| 4 | 2 | 25 |
| 5 | 2 | 1 |
| 6 | 6 | 98 |
| 7 | 1 | 1 |
+-----+----------+----------+
This is what my current query looks like.
$stmt2 = $dbConn->prepare("SELECT COUNT(DISTINCT user_assessment.id)
FROM user_assessment
LEFT JOIN user ON user_assessment.user_id = user.id
WHERE user.location = '$location'");
$stmt2->execute();
$stmt2->bind_result($video_count);
$stmt2->fetch();
$stmt2->close();
So my query returns all of the count for that specific location, but it doesn't omit the non-unique results from each specific user.
Hope this makes sense, thanks for the help.
SELECT COUNT(DISTINCT ua.video_id, ua.user_id)
FROM user_assessment ua
INNER JOIN user ON ua.user_id = user.id
WHERE user.location = '$location'
In MySQL you can put more than one expression inside COUNT(DISTINCT ...), so don't hesitate to put exactly what you want in it. This will give the number of distinct (video_id, user_id) pairs, which is what you wanted if I understood correctly.
The query below joins a sub-query that fetches the distinct videos per user. Then, the main query does a sum on those numbers to get the total of videos for the location.
SELECT
SUM(video_count)
FROM
user u
INNER JOIN
( SELECT
ua.user_id,
COUNT(DISTINCT video_id) as video_count
FROM
user_assessment ua
GROUP BY
ua.user_id) uav on uav.user_id = u.id
WHERE
u.location = '$location'
Note, that since you already use bindings, you can also pass $location in a bind parameter. I leave this to you, since it's not part of the question. ;-)
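As a sketch of the sub-query approach with the location passed as a bind parameter, the script below uses Python's `sqlite3` module in place of PHP and MySQL. The location names 'north' and 'south', the helper name `distinct_videos_passed`, and the `COALESCE` that returns 0 for a location with no matching users are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE user_assessment (id INTEGER PRIMARY KEY,
                                  video_id INTEGER, user_id INTEGER);
    -- hypothetical locations: users 1 and 50 share one, the rest another
    INSERT INTO user VALUES (1, 'north'), (50, 'north'),
                            (115, 'south'), (25, 'south'), (98, 'south');
    -- assessment rows copied from the question
    INSERT INTO user_assessment VALUES
        (1, 1, 1), (2, 2, 50), (3, 1, 115), (4, 2, 25),
        (5, 2, 1), (6, 6, 98), (7, 1, 1);
""")

def distinct_videos_passed(conn, location):
    """Sum, over users at a location, the number of distinct videos each passed."""
    query = """
        SELECT COALESCE(SUM(uav.video_count), 0)
        FROM user AS u
        JOIN (SELECT user_id, COUNT(DISTINCT video_id) AS video_count
              FROM user_assessment
              GROUP BY user_id) AS uav
          ON uav.user_id = u.id
        WHERE u.location = ?
    """
    # the location travels as a bound parameter, never interpolated into the SQL
    return conn.execute(query, (location,)).fetchone()[0]

total = distinct_videos_passed(conn, "north")
print(total)
```

For the 'north' location this gives 3: user 1 contributes videos 1 and 2 once each, and user 50 contributes video 2.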