How can I use self join to solve this SQL problem? - mysql

This is the question:
And an example:
This is the query I have:
SELECT a1.player_id, a1.event_date AS 'first_login'
FROM Activity a1, Activity a2
WHERE a1.event_date < a2.event_date
AND a1.player_id = a2.player_id;
The problem is that I don't get any player who just has one login instance.
Like player_id 2
Is there a way to just return a player_id if the player has just one instance of login?

You want to get the date of the first login of each user.
As a starter: the first solution that comes to mind here is aggregation:
select player_id, min(event_date) first_login
from activity
group by player_id
order by player_id
If you really need to do this with a self-join, I would recommend a left anti-join:
select a1.player_id, a1.event_date first_login
from activity a1
left join activity a2 on a2.player_id = a1.player_id and a2.event_date < a1.event_date
where a2.player_id is null
order by a1.player_id
The logic of the query is to ensure that there is no other row for the same user with an earlier event date: the left join looks up such rows, then the where clause keeps only the rows for which none was found.
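To check that both queries keep a player with a single login, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data: player 2 has exactly one login.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Activity (player_id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?)",
    [
        (1, "2016-03-01"),
        (1, "2016-05-02"),
        (2, "2017-06-25"),
        (3, "2016-03-02"),
        (3, "2018-07-03"),
    ],
)

# Aggregation: one row per player with the earliest date.
aggregated = conn.execute(
    """
    SELECT player_id, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
    ORDER BY player_id
    """
).fetchall()

# Left anti-join: keep a1 only when no earlier row a2 exists for the player.
anti_join = conn.execute(
    """
    SELECT a1.player_id, a1.event_date AS first_login
    FROM Activity a1
    LEFT JOIN Activity a2
      ON a2.player_id = a1.player_id
     AND a2.event_date < a1.event_date
    WHERE a2.player_id IS NULL
    ORDER BY a1.player_id
    """
).fetchall()

print(anti_join)
```

Player 2 appears in both results because the left join leaves its only row unmatched instead of dropping it, which is what the inner self-join in the question does.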

Related

Creating a SQL view with personal best records

I have the following SQL Database structure:
Users are the registered users. Maps are like circuits or race tracks. When a user finishes a run on a map, a new time record is created containing the userId, the mapId and the time needed to finish the race track.
I wish to create a view that lists every user's personal best on every map.
I tried creating the view like this:
CREATE VIEW map_pb AS
SELECT MID, UID, TID
FROM times
WHERE score IN (SELECT MIN(score) FROM times)
ORDER BY registered
This does not give the desired result.
Thank you for your help!
I assume the 'times' table is created as in the diagram above and has a 'score' column that you use to measure the record (MIN(score) is the best record).
You can create a view of the personal best records by joining against a subquery that finds the best score per user and map, like this:
CREATE VIEW map_pb AS
SELECT a.MID, a.UID, a.TID
FROM times a
INNER JOIN (
SELECT UID, MID, MIN(score) AS score
FROM times
GROUP BY UID, MID
) b ON a.UID = b.UID AND a.MID = b.MID AND a.score = b.score
-- if you have a 'registered' column in the 'times' table to order the result
ORDER BY a.registered
I hope this works.
You probably need to use a query that will first return the minimum score for each user on each map. Something like this:
SELECT UID,
MID,
MIN(score) AS best_time
FROM times
GROUP BY UID, MID
Note: I used MIN(score) as this is what is shown in your example query, but perhaps it should be MIN(time) instead?
Then just use the subquery JOINed to your other tables to get the output:
SELECT *
FROM (
SELECT UID,
MID,
MIN(score) AS best_time
FROM times
GROUP BY UID, MID
) a
INNER JOIN users u ON u.UID = a.UID
INNER JOIN maps m ON m.MID = a.MID
Of course, replace SELECT * with the columns you actually want.
Note: code untested but does give an idea as to a solution.
Start with a subquery to determine each user's minimum time on each map:
SELECT UID, MID, MIN(time) AS time
FROM times
GROUP BY UID, MID
Then join that subquery into a main query:
SELECT times.UID, times.MID, times.TID,
mintimes.time
FROM times
JOIN (
SELECT UID, MID, MIN(time) AS time
FROM times
GROUP BY UID, MID
) mintimes ON times.MID = mintimes.MID
AND times.UID = mintimes.UID
AND times.time = mintimes.time
JOIN maps ON times.MID = maps.MID
JOIN users ON times.UID = users.UID
This query pattern uses GROUP BY to find the extreme value (MIN in this case) for each combination. It then joins that subquery back to the table to find the detail record for each extreme value.
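The group-then-join-back pattern can be tried out with a small sketch like the following, which uses Python's built-in sqlite3 module in place of MySQL. The column layout (TID, UID, MID, score) and the sample rows are assumptions, since the schema diagram is not available.

```python
import sqlite3

# Assumed schema: TID is the record id, UID the user, MID the map.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE times (TID INTEGER PRIMARY KEY, UID INTEGER, "
    "MID INTEGER, score REAL)"
)
conn.executemany(
    "INSERT INTO times VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 50.0),
        (2, 1, 1, 45.5),  # user 1's best on map 1
        (3, 1, 2, 70.0),  # user 1's only run on map 2
        (4, 2, 1, 48.0),  # user 2's best on map 1
        (5, 2, 1, 52.0),
    ],
)

# Step 1 (subquery): best score per (user, map).
# Step 2 (join back): recover the full record that holds that score.
personal_bests = conn.execute(
    """
    SELECT t.UID, t.MID, t.TID, t.score
    FROM times t
    JOIN (
        SELECT UID, MID, MIN(score) AS best
        FROM times
        GROUP BY UID, MID
    ) b ON b.UID = t.UID AND b.MID = t.MID AND b.best = t.score
    ORDER BY t.UID, t.MID
    """
).fetchall()

print(personal_bests)
```

If a user ties their own best score on a map, this pattern returns both records, so a view built this way may need an extra tie-break.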

MYSQL: country with most new users in January?

I have 2 tables, users and events:
**Users:**
usersid
age
geo_country
gender
**events:**
ts
usersid
event
videoid
Where ts is the timestamp field. And possible events are 'start_video', 'browse_catalog', 'exit_video'
I want to find out which country had the most new users in January.
My code is as follows:
SELECT DISTINCT (u.geo_country), COUNT(e.userid) As Users_Ids
FROM (SELECT userid, DATE(MIN(ts)) AS first_time
FROM events
WHERE ts BETWEEN '2017-01-01 00:00:00' and '2017-01-31 24:00:00'
GROUP BY userid) AS e
LEFT JOIN users u ON u.userid= e.userid
GROUP BY first_time
ORDER BY COUNT(e.userid) DESC;
Since I don't have the session field, is my subquery all right in providing new users for January 2017?
Any help would be highly appreciated.
Thanks,
Claudia
I think the query that you posted is slightly incorrect.
The GROUP BY clause describes how to group the data set for the aggregate function. In your outer query you want to count the number of users per country, so the COUNT should go with GROUP BY u.geo_country rather than GROUP BY first_time. As a result, the DISTINCT on geo_country is no longer necessary.
GROUP BY first_time also gives wrong answers, because it counts users per unique first_time rather than per country. In addition, the date filter has to be applied after MIN(ts) is computed (in HAVING, not WHERE); otherwise a user who first appeared before January but was also active in January is counted as new.
The correct query should be:
SELECT u.geo_country,
COUNT(e.userid) As Users_Ids
FROM (SELECT userid, DATE(MIN(ts)) AS first_time
FROM events
GROUP BY userid
HAVING first_time BETWEEN '2017-01-01' AND '2017-01-31')
AS e
LEFT JOIN users u ON u.userid= e.userid
GROUP BY u.geo_country
ORDER BY Users_Ids DESC;
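The difference between filtering before and after taking MIN(ts) can be seen in a small sketch, again using Python's built-in sqlite3 module as a stand-in for MySQL. The rows are invented; user 1 first appears in December 2016 and is also active in January 2017.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (userid INTEGER PRIMARY KEY, geo_country TEXT);
    CREATE TABLE events (ts TEXT, userid INTEGER, event TEXT);
    INSERT INTO users VALUES (1,'US'), (2,'US'), (3,'DE'), (4,'DE'), (5,'FR');
    INSERT INTO events VALUES
        ('2016-12-15 09:00:00', 1, 'start_video'),
        ('2017-01-05 10:00:00', 1, 'browse_catalog'),
        ('2017-01-03 10:00:00', 2, 'start_video'),
        ('2017-01-10 12:00:00', 3, 'start_video'),
        ('2017-01-31 23:30:00', 4, 'exit_video'),
        ('2017-02-01 00:10:00', 5, 'start_video');
    """
)

# Filter on the first event per user (HAVING): counts only new users.
new_users = conn.execute(
    """
    SELECT u.geo_country, COUNT(e.userid) AS new_users
    FROM (
        SELECT userid, DATE(MIN(ts)) AS first_time
        FROM events
        GROUP BY userid
        HAVING DATE(MIN(ts)) BETWEEN '2017-01-01' AND '2017-01-31'
    ) AS e
    LEFT JOIN users u ON u.userid = e.userid
    GROUP BY u.geo_country
    ORDER BY new_users DESC, u.geo_country
    """
).fetchall()

# Filter rows first (WHERE): counts every user active in January.
active_users = conn.execute(
    """
    SELECT u.geo_country, COUNT(e.userid) AS active_users
    FROM (
        SELECT userid
        FROM events
        WHERE ts >= '2017-01-01' AND ts < '2017-02-01'
        GROUP BY userid
    ) AS e
    LEFT JOIN users u ON u.userid = e.userid
    GROUP BY u.geo_country
    ORDER BY active_users DESC, u.geo_country
    """
).fetchall()

print(new_users, active_users)
```

With this data the WHERE version reports two users for the US because it includes user 1, while the HAVING version reports one.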

How do I consolidate/combine data when using a SQL join?

So I have a table called Events which looks like:
Id Date Title Location
1 2014 test New York
And another table called Quote Items:
ID Item_type cost event_id
1 paper 2 1
2 water 1 1
I have a simple join query like so:
select events.title, events.id, events.location, events.date, active_quote_items.cost
from active_quote_items
left join events
on active_quote_items.event_id=events.id
This returns the data I want, but each event can have multiple quote items. I want to merge this data so that the cost of all items is consolidated into a single column after the join. Is this possible, or is something similar possible?
You need GROUP BY with the SUM aggregate:
SELECT events.title,
events.id,
events.location,
events.date,
Sum(active_quote_items.cost)
FROM active_quote_items
LEFT JOIN events
ON active_quote_items.event_id = events.id
GROUP BY events.title,
events.id,
events.location,
events.date
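As a quick check of the GROUP BY approach, here is a sketch that loads the two sample tables from the question into an in-memory SQLite database through Python's sqlite3 module (used here instead of the asker's database) and runs the same query. The alias total_cost is added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (id INTEGER PRIMARY KEY, date INTEGER,
                         title TEXT, location TEXT);
    CREATE TABLE active_quote_items (id INTEGER PRIMARY KEY, item_type TEXT,
                                     cost INTEGER, event_id INTEGER);
    INSERT INTO events VALUES (1, 2014, 'test', 'New York');
    INSERT INTO active_quote_items VALUES (1, 'paper', 2, 1), (2, 'water', 1, 1);
    """
)

# One output row per event, with the item costs summed.
totals = conn.execute(
    """
    SELECT events.title, events.id, events.location, events.date,
           SUM(active_quote_items.cost) AS total_cost
    FROM active_quote_items
    LEFT JOIN events ON active_quote_items.event_id = events.id
    GROUP BY events.title, events.id, events.location, events.date
    """
).fetchall()

print(totals)
```

The two quote items collapse into a single row for the event, with their costs added together.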
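In SQL Server, you could also use the analytic function SUM() OVER to do this. Note that, unlike GROUP BY, this keeps one row per quote item and repeats the event total on each row: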
SELECT e.title
,e.id
,e.location
,e.[DATE]
,Sum(aq.cost) OVER (PARTITION BY aq.event_id)
FROM active_quote_items aq
LEFT JOIN events e ON aq.event_id = e.id;

How to use MAX and COUNT function in MYSQL with 2 tables

I am a newbie in MySQL and have a question about using the MAX and COUNT functions together. I have 2 tables, worker and assignment, and the primary key of worker is a foreign key in the assignment table.
I need to show the employee's name and id and the total number of assignments assigned to them, but only for the employee with the most assignments.
my code is
SELECT worker.Wrk_ID, worker.Wrk_LastName, MAX(a.count_id)
FROM worker,
(SELECT COUNT(assignment.Wrk_ID) as count_ID
FROM worker, assignment
WHERE worker.Wrk_ID = assignment.Wrk_ID
GROUP BY worker.Wrk_ID)as a
GROUP BY worker.Wrk_ID;
The code is giving an error no. #1054.
Please can anyone help me.
Thanking you in anticipation.
Try something like this:
SELECT worker.Wrk_ID, worker.Wrk_LastName, S.Assignment_Count
FROM worker
JOIN
(SELECT Wrk_ID, COUNT(*) AS Assignment_Count FROM assignment
GROUP BY Wrk_ID ORDER BY COUNT(*) DESC LIMIT 1) S
ON worker.Wrk_ID = S.Wrk_ID
If you want a list of employees sorted by their total assignments:
SELECT w.Wrk_ID, w.Wrk_LastName, COUNT(a.Wrk_ID) AS Assignments
FROM worker w LEFT JOIN assignment a
ON w.Wrk_ID = a.Wrk_ID
GROUP BY w.Wrk_ID, w.Wrk_LastName
ORDER BY Assignments DESC;
To allow multiple winners:
SELECT winners.*, w.Wrk_LastName FROM
(
SELECT Wrk_ID, COUNT(*) AS tot_assignments
FROM assignment
GROUP BY Wrk_ID
HAVING COUNT(*) =
(
SELECT MAX(tot) FROM
(
SELECT COUNT(*) AS tot FROM assignment GROUP BY Wrk_ID
) counts
)
) winners
INNER JOIN worker w ON winners.Wrk_ID = w.Wrk_ID;
It can be slow since it runs GROUP BY more than once. Doing it in separate steps in a procedure may be better.
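The multiple-winners query can be exercised with a short sketch that uses Python's built-in sqlite3 module in place of MySQL. The table and column names follow the question; the rows are invented so that two workers tie for the most assignments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE worker (Wrk_ID INTEGER PRIMARY KEY, Wrk_LastName TEXT);
    CREATE TABLE assignment (Asg_ID INTEGER PRIMARY KEY, Wrk_ID INTEGER);
    INSERT INTO worker VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Lee');
    -- Smith and Jones have two assignments each, Lee has one.
    INSERT INTO assignment (Wrk_ID) VALUES (1), (1), (2), (2), (3);
    """
)

# Keep every worker whose count equals the maximum count.
winners = conn.execute(
    """
    SELECT w.Wrk_ID, w.Wrk_LastName, winners.tot_assignments
    FROM (
        SELECT Wrk_ID, COUNT(*) AS tot_assignments
        FROM assignment
        GROUP BY Wrk_ID
        HAVING COUNT(*) = (
            SELECT MAX(tot)
            FROM (SELECT COUNT(*) AS tot FROM assignment GROUP BY Wrk_ID) counts
        )
    ) winners
    INNER JOIN worker w ON winners.Wrk_ID = w.Wrk_ID
    ORDER BY w.Wrk_ID
    """
).fetchall()

print(winners)
```

A LIMIT 1 version would return only one of the two tied workers here, which is why the HAVING comparison against the maximum is used when ties matter.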

MySQL evaluate case with subquery

I am trying to create a custom sort that involves the count of some records in another table. For example, if one record has no records associated with it in the other table, it should appear higher in the sort than if it had one or more records. Here's what I have so far:
SELECT People.*, Organizations.Name AS Organization_Name,
(CASE
WHEN Sent IS NULL AND COUNT(SELECT * FROM Graphics WHERE People.Organization_ID = Graphics.Organization_ID) = 0 THEN 0
ELSE 1
END) AS Status
FROM People
LEFT JOIN Organizations ON Organizations.ID = People.Organization_ID
ORDER BY Status ASC
The subquery within the COUNT is not working. What is the correct way to do something like this?
Update: I moved the case statement into the order by clause and added a join:
SELECT People.*, Organizations.Name AS Organization_Name
FROM People
LEFT JOIN Organizations ON Organizations.ID = People.Organization_ID
LEFT JOIN Graphics ON Graphics.Organization_ID = People.Organization_ID
GROUP BY People.ID
ORDER BY
CASE
WHEN Sent IS NULL AND Graphics.ID IS NULL THEN 0
ELSE 1
END ASC
So if the People record does not have any graphics, Graphics.ID will be null. This achieves the immediate need.
If what you tried does not work, it can be done by joining against a subquery, and placing the CASE expression into ORDER BY as well:
SELECT
People.*,
orgcount.num
FROM People LEFT JOIN (
SELECT Organization_ID, COUNT(*) AS num FROM Graphics GROUP BY Organization_ID
) orgcount ON People.Organization_ID = orgcount.Organization_ID
ORDER BY
CASE WHEN Sent IS NULL AND orgcount.num IS NULL THEN 0 ELSE 1 END,
orgcount.num DESC
You could use an outer join to the Graphics table to get the data needed for your sort.
Since I don't know your schema, I made an assumption that the People table has a primary key column called ID. If the PK column has a different name, you should substitute that in the GROUP BY clause.
Something like this should work for you:
SELECT People.*, (count(Distinct Graphics.Organization_ID) > 0) as Status
FROM People
LEFT OUTER JOIN Graphics ON People.Organization_ID = Graphics.Organization_ID
GROUP BY People.ID
ORDER BY Status ASC
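Here is a minimal sketch of the outer-join approach, run against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The People and Graphics columns are assumptions about the schema, and the Status expression also folds in the Sent condition from the question's CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (ID INTEGER PRIMARY KEY, Name TEXT,
                         Organization_ID INTEGER, Sent TEXT);
    CREATE TABLE Graphics (ID INTEGER PRIMARY KEY, Organization_ID INTEGER);
    -- Organization 10 has no graphics; organization 20 has two.
    INSERT INTO People VALUES
        (1, 'Ann', 10, NULL),
        (2, 'Bob', 20, NULL),
        (3, 'Cy',  10, '2014-01-01');
    INSERT INTO Graphics (Organization_ID) VALUES (20), (20);
    """
)

# Status 0: nothing sent and no graphics for the person's organization.
# COUNT(g.ID) ignores the NULLs produced by unmatched left-join rows.
ranked = conn.execute(
    """
    SELECT p.ID, p.Name,
           CASE WHEN p.Sent IS NULL AND COUNT(g.ID) = 0 THEN 0 ELSE 1 END
               AS Status
    FROM People p
    LEFT OUTER JOIN Graphics g ON g.Organization_ID = p.Organization_ID
    GROUP BY p.ID
    ORDER BY Status ASC, p.ID
    """
).fetchall()

print(ranked)
```

Only Ann sorts first: Bob's organization has graphics, and Cy has a Sent value.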
This is fairly straightforward with a LEFT JOIN, provided you have some kind of primary key in the People table to GROUP on:
SELECT p.*, (p.Sent IS NOT NULL OR COUNT(g.Organization_ID) > 0) AS Status
FROM People p LEFT JOIN Graphics g ON g.Organization_ID = p.Organization_ID
GROUP BY p.primary_key
ORDER BY Status