SQL query to select 3 consecutive years between records - mysql

I have a database full of all sorts of records regarding baseball teams and their players. I need to write a query that shows the names of any player who has played for the teamID "MON" three consecutive years. I've already written a query that gives me the table below, showing the years they played for that team.
| nameFirst | nameLast| Year |
+-----------+---------+-------+
| Santo | Alcala | 1977 |
| Santo | Alcala | 1978 |
| Santo | Alcala | 1979 |
| Scott | Aldred | 1993 |
I haven't entered any more records in the table, but this should be enough to understand the situation. The actual table in my DB has thousands of records. The query I need would return one record for Santo Alcala, since he played three consecutive years for the MON team. The above table only shows players who played for MON; I already wrote a query that excludes all players who played for other teams.
The desired output of the query would be a record such as:
| nameFirst | nameLast|
+--------------+---------+
| Santo | Alcala |
If a player played for more than 3 consecutive years on the team, they would also be shown in the results.

Are you looking for something like the below?
Schema
CREATE TABLE PLAYER (
ID INT,
FIRST VARCHAR(25),
LAST VARCHAR(25),
YEAR INT
);
INSERTS
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (1, "Santo", "Alcala", 1977);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (2, "Santo", "Alcala", 1978);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (3, "Santo", "Alcala", 1979);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (4, "Santo", "Alcala", 1980);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (5, "Santo", "Aldred", 1993);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (6, "Santo", "Aldred", 1994);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (7, "Santo", "Royal", 1994);
Query
select DISTINCT p1.FIRST, p1.LAST
from PLAYER p1
inner join PLAYER p2 on p2.YEAR = p1.YEAR + 1 and p2.FIRST = p1.FIRST and p2.LAST = p1.LAST
inner join PLAYER p3 on p3.YEAR = p1.YEAR + 2 and p3.FIRST = p1.FIRST and p3.LAST = p1.LAST;
OUTPUT
FIRST LAST
Santo Alcala
SQL FIDDLE
http://sqlfiddle.com/#!9/b5c1c4/1
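For runs of arbitrary length, a different technique than the self-join can help: the "gaps and islands" grouping, where `year` minus the row's position within the player's years is constant across a consecutive run. The sketch below is not the answer's query; it runs on SQLite through Python's `sqlite3` so it can be tried without a MySQL server, and it assumes one row per player per year. `MIN_RUN` and the `island` alias are names made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INT, first TEXT, last TEXT, year INT)")
conn.executemany(
    "INSERT INTO player VALUES (?, ?, ?, ?)",
    [
        (1, "Santo", "Alcala", 1977),
        (2, "Santo", "Alcala", 1978),
        (3, "Santo", "Alcala", 1979),
        (4, "Santo", "Alcala", 1980),
        (5, "Santo", "Aldred", 1993),
        (6, "Santo", "Aldred", 1994),
        (7, "Santo", "Royal", 1994),
    ],
)

MIN_RUN = 3  # hypothetical parameter: minimum number of consecutive years

# For each row, subtract the row's position among that player's years from
# the year itself. Consecutive years share the same difference ("island").
# Assumes (first, last, year) is unique.
query = """
SELECT first, last
FROM (
    SELECT p.first, p.last,
           p.year - (SELECT COUNT(*) FROM player q
                     WHERE q.first = p.first AND q.last = p.last
                       AND q.year <= p.year) AS island
    FROM player p
) AS t
GROUP BY first, last, island
HAVING COUNT(*) >= ?
"""
# A player with several long runs would appear once per run, so deduplicate.
streak_players = sorted(set(conn.execute(query, (MIN_RUN,)).fetchall()))
print(streak_players)
```

Only the player with at least three years in a row is returned; the two-year and one-year players drop out in the `HAVING` clause.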

Related

MYSQL multiple conditional statements for count

I'm very new to MYSQL, have looked at many answers on this site but can't get the following to work...
Table is "member"
3 fields are "id" (Integer); and 2 date fields "dob" and "expiry"
I need to count the number of records where all are current members, ie
expiry<curdate()
then I need to know the count of records with the following conditions:
year(curdate())-year(dob) <25 as young
year(curdate())-year(dob) >25 and <=50 as Medium
year(curdate())-year(dob) >50 as Older
So I expect to get a single row with many columns and the count of each of these conditions.
Effectively I'm filtering current members for their age grouping.
I've tried a subquery but failed to get that to work.
Thanks

If you really want the end result in the form you describe, you could use views. It is a roundabout way to get there, but it works. I created the following table member and inserted data as follows.
CREATE TABLE member (
id int(11) AUTO_INCREMENT PRIMARY KEY,
dob date DEFAULT NULL,
expiry date DEFAULT NULL
);
INSERT INTO member (id, dob, expiry) VALUES
(1, '1980-01-01', '2020-05-05'),
(2, '1982-05-05', '2020-01-01'),
(3, '1983-05-05', '2020-01-01'),
(4, '1981-05-05', '2020-01-01'),
(5, '1994-05-05', '2020-01-01'),
(6, '1992-05-05', '2020-01-01'),
(7, '1960-05-05', '2020-01-01'),
(8, '1958-05-05', '2020-01-01'),
(9, '1958-07-07', '2020-05-05');
Following is the member table with data.
id | dob | expiry
--------------------------------
1 | 1980-01-01 | 2020-05-05
2 | 1982-05-05 | 2020-01-01
3 | 1983-05-05 | 2020-01-01
4 | 1981-05-05 | 2020-01-01
5 | 1994-05-05 | 2020-01-01
6 | 1992-05-05 | 2020-01-01
7 | 1960-05-05 | 2020-01-01
8 | 1958-05-05 | 2020-01-01
9 | 1958-07-07 | 2020-05-05
Then I created a separate view for all the current members, named current_members, as follows.
CREATE VIEW current_members AS (SELECT * FROM member WHERE TIMESTAMPDIFF(YEAR, CAST(CURRENT_TIMESTAMP AS DATE), member.expiry) >= 0);
Then, querying that view, I created three separate views holding the counts for the young, middle and old age ranges. The bounds are chosen so that no age falls into two ranges.
CREATE VIEW young AS (SELECT COUNT(*) as Young FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age < 25) yng);
CREATE VIEW middle AS (SELECT COUNT(*) as Middle FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age BETWEEN 25 AND 50) mid);
CREATE VIEW old AS (SELECT COUNT(*) as Old FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age > 50) old);
Finally, the three views were cross joined in order to get the counts of each age range into a single row of one final table as follows.
SELECT * FROM young, middle, old;
This will give you the following result.
Young | Middle | Old
----------------------
2 | 4 | 3
Suggestion: to avoid repeating the tedious age calculation, you could wrap it in your own stored function to simplify the code.
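The same single-row result can probably be had without any views by using conditional aggregation, i.e. one `SUM(CASE WHEN ... END)` per age bucket. The sketch below is an alternative to the views, not the answer's method. It runs on SQLite through Python's `sqlite3`, so the age is computed with `strftime` arithmetic; in MySQL, `TIMESTAMPDIFF(YEAR, dob, CURDATE())` would take its place. `REFERENCE_DATE` is a fixed stand-in for "today", chosen here only so the output is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, dob TEXT, expiry TEXT)")
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?)",
    [
        (1, "1980-01-01", "2020-05-05"),
        (2, "1982-05-05", "2020-01-01"),
        (3, "1983-05-05", "2020-01-01"),
        (4, "1981-05-05", "2020-01-01"),
        (5, "1994-05-05", "2020-01-01"),
        (6, "1992-05-05", "2020-01-01"),
        (7, "1960-05-05", "2020-01-01"),
        (8, "1958-05-05", "2020-01-01"),
        (9, "1958-07-07", "2020-05-05"),
    ],
)

REFERENCE_DATE = "2017-01-01"  # assumed "today"; use the real current date in practice

# Inner query: keep current members and compute their age in whole years
# (year difference, minus 1 if the birthday has not happened yet this year).
# Outer query: one conditional sum per age bucket, all in a single row.
query = """
SELECT SUM(CASE WHEN age < 25 THEN 1 ELSE 0 END)              AS young,
       SUM(CASE WHEN age BETWEEN 25 AND 50 THEN 1 ELSE 0 END) AS middle,
       SUM(CASE WHEN age > 50 THEN 1 ELSE 0 END)              AS old
FROM (
    SELECT (strftime('%Y', :ref) - strftime('%Y', dob))
           - (strftime('%m-%d', :ref) < strftime('%m-%d', dob)) AS age
    FROM member
    WHERE expiry >= :ref
) AS current_members
"""
counts = conn.execute(query, {"ref": REFERENCE_DATE}).fetchone()
print(counts)
```

With this reference date the sample data gives two young, four middle and three old members in one row.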

Get a "chain of events" in SQL

I have a table that looks like this:
CustomerID | ContactTime | AttemptResult
-----------+-----------------+-----------------
1 | 1/1/2016 5:00 | Record Started
1 | 1/1/2016 6:00 | Appointment
2 | 1/2/2016 5:00 | Record Started
1 | 1/3/2016 6:00 | Sold
2 | 1/2/2016 5:00 | Sold
3 | 1/4/2016 5:00 | Record Started
3 | 1/4/2016 6:00 | Sold
From
create table #temp1
(
CustomerID int,
ContactTime datetime,
Result nvarchar(50)
)
insert into #temp1 values (1, '1/1/2016 5:00', 'Record Started')
insert into #temp1 values (1, '1/1/2016 6:00', 'Appointment')
insert into #temp1 values (2, '1/2/2016 5:00', 'Record Started')
insert into #temp1 values (1, '1/3/2016 6:00', 'Sold')
insert into #temp1 values (2, '1/2/2016 5:00', 'Sold')
insert into #temp1 values (3, '1/4/2016 5:00', 'Record Started')
insert into #temp1 values (3, '1/4/2016 6:00 ', 'Sold')
How can I query this in a way that gets all combinations in order of AttemptResults ? So something like:
CustID | Sequence
-------+--------------------------------------
1 | Record Started -> Appointment -> Sold
2 | Record Started -> Sold
3 | Record Started -> Sold
I'm not even sure where to start...

If this is your complete dataset, I can help you. Otherwise I would need to see more. Use something called a window function. The query below essentially numbers the entries for each CustomerID in order of ContactTime.
Select *, row_number() over (partition by CustomerID order by ContactTime) as Combo
into #temp
from #temp1
Then count how many combos of 2 happen (Record Started -> Sold) and how many combos of 3 (Record Started -> Appointment -> Sold).
Select CustomerID, max(Combo) as MaxCombo
into #temp2
from #temp
group by CustomerID
Select MaxCombo, count(*)
from #temp2
group by MaxCombo
You could also use common table expressions instead of these temp tables, but I didn't want to add too much confusion.
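To get the actual "A -> B -> C" string per customer, the rows have to be ordered by ContactTime and then concatenated per CustomerID. On SQL Server 2017 and later, `STRING_AGG(...) WITHIN GROUP (ORDER BY ...)` can do this in the query itself. The sketch below only illustrates the idea on SQLite through Python's `sqlite3`: it orders in SQL and joins in Python. The table name `contacts` and the ISO timestamps are assumptions for the illustration, and because customer 2 has two rows with the same timestamp, insertion order (`rowid`) is used as an assumed tie-breaker.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (CustomerID INT, ContactTime TEXT, Result TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "2016-01-01 05:00", "Record Started"),
        (1, "2016-01-01 06:00", "Appointment"),
        (2, "2016-01-02 05:00", "Record Started"),
        (1, "2016-01-03 06:00", "Sold"),
        (2, "2016-01-02 05:00", "Sold"),
        (3, "2016-01-04 05:00", "Record Started"),
        (3, "2016-01-04 06:00", "Sold"),
    ],
)

# Order by customer, then time; rowid breaks ties between equal timestamps
# (assumption: rows were inserted in the order the events happened).
ordered = conn.execute(
    "SELECT CustomerID, Result FROM contacts "
    "ORDER BY CustomerID, ContactTime, rowid"
).fetchall()

# groupby needs its input sorted by the grouping key, which the query ensures.
sequences = {
    customer: " -> ".join(result for _, result in rows)
    for customer, rows in groupby(ordered, key=lambda row: row[0])
}
for customer, chain in sequences.items():
    print(customer, "|", chain)
```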

Find sum of stacked/overlapping date intersections in SQL table

I have the following table which represents bookings of articles:
+---+------------+----------+-------------+-------------+
|id | article_id | quantity | starts_at | ends_at |
+---+------------+----------+-------------+-------------+
| 1 | 1 | 1 | 2015-03-01 | 2015-03-20 |
| 2 | 1 | 2 | 2015-03-02 | 2015-03-03 |
| 3 | 1 | 3 | 2015-03-04 | 2015-03-15 |
| 4 | 1 | 2 | 2015-03-16 | 2015-03-22 |
| 5 | 1 | 2 | 2015-03-11 | 2015-03-19 |
| 6 | 2 | 2 | 2015-03-06 | 2015-03-22 |
| 7 | 2 | 3 | 2015-03-02 | 2015-03-04 |
+---+------------+----------+-------------+-------------+
From this table I want to extract the following information:
+------------+----------+
| article_id | sum |
+------------+----------+
| 1 | 6 |
| 2 | 3 |
+------------+----------+
Sum represents the max sum of quantity of stacked/overlapping booked articles for the given time ranges. In the first table article with id=1 has its maximum from booking 1, 3 and 5.
Is there any MySQL solution to obtain this information from a table like this?
Thank you very much!
EDIT: The date intersections are crucial. Let's say booking 5 starts at 2015-03-17 the sum for article_id=1 results 5, because booking 3 and 5 are not overlapping anymore. The sql should automatically consider all possible overlapping possibilities.

My answer is going to seem crazy complicated, perhaps; but it isn't, if one accepts that the use of a calendar table is an excellent MySQL idiom for dealing with date range related issues. I've closely adapted calendar table code from Artful Software's calendar table article. Artful Software's query techniques are a wonderful resource for doing complicated things in MySQL. The calendar table gives you a row per individual date that you are working with, which makes many things much easier.
For the whole thing below, you can go to this sqlfiddle for a place to play around with the code. It'll take a while to load.
First, here is your data:
CREATE TABLE articles
(`id` int, `article_id` int, `quantity` int, `starts_at` datetime, `ends_at` datetime);
INSERT INTO articles
(`id`, `article_id`, `quantity`, `starts_at`, `ends_at`)
VALUES
(1, 1, 1, '2015-03-01 00:00:00', '2015-03-20 00:00:00'),
(2, 1, 2, '2015-03-02 00:00:00', '2015-03-03 00:00:00'),
(3, 1, 3, '2015-03-04 00:00:00', '2015-03-15 00:00:00'),
(4, 1, 2, '2015-03-16 00:00:00', '2015-03-22 00:00:00'),
(5, 1, 2, '2015-03-11 00:00:00', '2015-03-19 00:00:00'),
(6, 2, 2, '2015-03-06 00:00:00', '2015-03-22 00:00:00'),
(7, 2, 3, '2015-03-02 00:00:00', '2015-03-04 00:00:00');
Next, here is the creation of the calendar table--I've created somewhat more date rows than needed (going back to start of year, and forward to start of next year). Ideally you just permanently keep a more massive calendar table on hand, covering a span of dates that will handle anything you could ever need. All the stuff below is going to seem quite lengthy and complex. But if you already have a calendar table lying around, the whole next section is not necessary.
CREATE TABLE calendar ( dt datetime primary key );
/* the views below will be joined and rejoined to themselves to
get the effect creating many rows. V ends up with 10 rows. */
CREATE OR REPLACE VIEW v3 as SELECT 1 n UNION ALL SELECT 1 UNION ALL SELECT 1;
CREATE OR REPLACE VIEW v as SELECT 1 n FROM v3 a, v3 b UNION ALL SELECT 1;
/* Going to limit the calendar table to first of year of min date
and first of year after max date */
SELECT @min := makedate(year(min(starts_at)),1) FROM articles;
SELECT @max := makedate(year(max(ends_at))+1,1) FROM articles;
SET @inc = -1;
/* below we work with @min date + @inc days successively, with @inc:=@inc+1
acting like ++variable, so we start with minus 1.
We insert as many individual date rows as we want by self-joining v,
and using some kind of limit via WHERE to keep the calendar table small
for our example. For n occurrences of v below, you get a max
of 10^n rows in the calendar table. We are using v as row-creation
engine. */
INSERT INTO calendar
SELECT @min + interval @inc:=@inc+1 day as dt
FROM v a, v b, v c, v d # , v e , v f
WHERE @inc < datediff(@max,@min);
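On engines that support recursive common table expressions (MySQL 8.0 and later, or SQLite), the calendar rows can probably be generated more directly. The sketch below is an alternative to the view-based row generator, not the code from the linked article. It runs on SQLite through Python's `sqlite3`, and the two boundary dates are hard-coded assumptions matching the sample data (first of the year through first of the next year).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (dt TEXT PRIMARY KEY)")

# Start at first_day and keep adding one day until last_day is reached.
# SQLite accepts the WITH clause in front of INSERT; other engines may want
# it between INSERT INTO ... and SELECT instead.
conn.execute(
    """
    WITH RECURSIVE days(dt) AS (
        SELECT :first_day
        UNION ALL
        SELECT date(dt, '+1 day') FROM days WHERE dt < :last_day
    )
    INSERT INTO calendar (dt)
    SELECT dt FROM days
    """,
    {"first_day": "2015-01-01", "last_day": "2016-01-01"},
)

day_count, first_dt, last_dt = conn.execute(
    "SELECT COUNT(*), MIN(dt), MAX(dt) FROM calendar"
).fetchone()
print(day_count, first_dt, last_dt)
```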
Now we are ready to find the stackings. Assuming the above (big assumption, I know), this becomes pretty easy. I'm going to do it through a few views for readability.
/* now create a view that will let us easily view the articles
related to individual dates when we query.
Not necessary, just makes things easier to read. */
CREATE OR REPLACE VIEW articles_to_dates as
SELECT c.dt, article_id
FROM articles a
INNER JOIN calendar c on c.dt between (SELECT min(starts_at) FROM articles) and (SELECT max(ends_at) FROM articles)
GROUP BY article_id, c.dt;
-- SELECT * FROM articles_to_dates -- This query would show the view's result
/* next view is the total amount of articles booked per individual date */
CREATE OR REPLACE VIEW booked_quantities_per_day AS
SELECT a2d.dt,a2d.article_id, SUM(a.quantity) as booked_quantity
FROM articles_to_dates a2d
INNER JOIN articles a on a2d.dt between a.starts_at and a.ends_at and a.article_id = a2d.article_id
GROUP BY a2d.dt, a2d.article_id
ORDER by a2d.article_id, a2d.dt;
-- SELECT * from booked_quantities_per_day -- this query would show the view's result
Finally, here are the desired results:
SELECT article_id, max(booked_quantity) max_stacked
FROM booked_quantities_per_day
GROUP BY article_id;
Results:
article_id max_stacked
1 6
2 3
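If no calendar table is at hand, a different technique may be enough for this particular question: the stacked quantity can only increase on a day where some booking starts, so it is sufficient to evaluate the total at each booking's start date and take the maximum per article. The sketch below swaps in that idea instead of the calendar approach. It runs on SQLite through Python's `sqlite3`, stores dates as ISO strings, and treats both ends of a booking as inclusive, like the `BETWEEN` joins above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INT, article_id INT, quantity INT, "
    "starts_at TEXT, ends_at TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "2015-03-01", "2015-03-20"),
        (2, 1, 2, "2015-03-02", "2015-03-03"),
        (3, 1, 3, "2015-03-04", "2015-03-15"),
        (4, 1, 2, "2015-03-16", "2015-03-22"),
        (5, 1, 2, "2015-03-11", "2015-03-19"),
        (6, 2, 2, "2015-03-06", "2015-03-22"),
        (7, 2, 3, "2015-03-02", "2015-03-04"),
    ],
)

# s = the booking whose start date is being inspected,
# o = every booking of the same article that covers that date (s included).
query = """
SELECT article_id, MAX(stacked) AS max_stacked
FROM (
    SELECT s.id, s.article_id, SUM(o.quantity) AS stacked
    FROM articles s
    JOIN articles o
      ON o.article_id = s.article_id
     AND s.starts_at BETWEEN o.starts_at AND o.ends_at
    GROUP BY s.id, s.article_id
) AS per_start
GROUP BY article_id
ORDER BY article_id
"""
max_stacked = conn.execute(query).fetchall()
print(max_stacked)
```

For article 1 the maximum is reached on 2015-03-11, where bookings 1, 3 and 5 overlap, which matches the expectation in the question.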

This should work.
Two groups. First to get distinct list of possible 'quantity'; second - summarise them
SELECT article_id, SUM(sub.quantity) FROM
(SELECT article_id, quantity FROM table GROUP BY article_id, quantity) as sub
GROUP BY article_id

select sum(quantity) from ...
where
... select your date range ...
group by article_id

INNER JOIN not working for me

I'm currently working on a database for my Magic: The Gathering Playgroup which keeps track of decks and more specific which decks win against how many others and so on.
The table "Wins" looks like the following:
PNr (Playernumber which is primary key in the table players)
DNr (Decknumber which is primary key in the table decks)
Date (combined primary key with MNr)
MNr (Matchnumber of the day)
Pl (Amount of Players in the game)
Loc (Location)
Code (containing of all the playing players Shortcuts, e.g. AMT for the Players Alex, Martin and Tobias, see below)
The table Players is pretty easy:
PNr
Pname (Playersname)
SC (Players Shortcut)
Now I wanted to make a query that provides a table of the expected winrate (which is 1/4 in a 4-player game, 1/5 in a 5-player game, etc.) and the actual number of wins for each player (and later on the expected and actual winrate, but I think I can work that out on my own once I get this baby to work).
So far I've come up with something like this:
SELECT a.'Player',a.'ExpectedWinrate',b.'Wins'
FROM(
SELECT
ROUND(((SUM(1/Pl))/Count(*))*100, 1) as 'ExpectedWinrate',
Players.Pname as 'Player'
FROM
Wins, Players
WHERE Code LIKE CONCAT('%', Players.SC, '%')
GROUP BY Players.Pname) a
INNER JOIN
(SELECT
Count(*) as 'Wins',
Players.Pname as 'Players'
FROM Players, Wins
WHERE Players.PNr = Wins.PNr
GROUP BY Players.Pname
ORDER BY Count(*) desc) b ON 'Players' = 'Player';
The problem that I've run into is that I need the Count(*) for two different things in one query so I had to make two independent ones and join them, but I don't know how to "name" them (in this case I tried with "a" and "b") in order to use expressions like a.'Player', a.'ExpectedWinrate', etc.
Can anyone help a MYSQL newb?^^
greetzSP
EDIT: added example tables...
CREATE TABLE Players
(
PNr int primary key,
Pname varchar(20),
SC varchar(1)
);
INSERT INTO Players
(PNr, Pname, SC)
VALUES
(1, 'Tobias', 'T'),
(2, 'Alex', 'A'),
(3, 'Martin', 'M'),
(4, 'Maria', 'R');
CREATE TABLE Wins
(
PNr int,
DNr int,
Pl int,
Code varchar(10)
);
INSERT INTO Wins
(PNr, DNr, Pl, Code)
VALUES
(1, 13, 3, 'ATM'),
(4, 1, 4, 'RTMA'),
(3, 20, 3, 'RTM');
Wins: (leaving out columns that don't matter in this query)
| PNR | DNR | PL | CODE |
|-----|-----|----|------|
| 1 | 13 | 3 | ATM |
| 4 | 1 | 4 | RTMA |
| 3 | 20 | 3 | RTM |
Players:
| PNR | PNAME | SC |
|-----|--------|----|
| 1 | Tobias | T |
| 2 | Alex | A |
| 3 | Martin | M |
| 4 | Maria | R |

SELECT a.Player ,a.ExpectedWinrate ,b.Wins
FROM(
SELECT
ROUND(((SUM(1/w.Pl))/Count(*))*100, 1) as 'ExpectedWinrate'
,p.Pname as 'Player'
FROM Wins w inner join Players p
on w.Code LIKE CONCAT('%', p.SC, '%')
GROUP BY p.Pname
) a
inner join
(
SELECT
Count(*) as 'Wins'
,p.Pname as 'Players'
FROM Players p inner join Wins w
on p.PNr = w.PNr
GROUP BY p.Pname
-- ORDER BY Count(*) desc
) b
ON a.Player = b.Players;
I have tested it on SQL Server; try it on MySQL.
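The key points are that each derived table gets an alias after its closing parenthesis (`a`, `b`) and that identifiers are not wrapped in single quotes, since `'Players' = 'Player'` compares two string literals rather than two columns. A minimal sketch of the same two-subquery join on the sample data follows. It runs on SQLite through Python's `sqlite3`, so `CONCAT` is replaced by `||` and `1/Pl` by `1.0 / Pl` (SQLite would otherwise do integer division); those substitutions are adaptations for the illustration, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Players (PNr INT PRIMARY KEY, Pname TEXT, SC TEXT);
    INSERT INTO Players VALUES
        (1, 'Tobias', 'T'), (2, 'Alex', 'A'), (3, 'Martin', 'M'), (4, 'Maria', 'R');
    CREATE TABLE Wins (PNr INT, DNr INT, Pl INT, Code TEXT);
    INSERT INTO Wins VALUES
        (1, 13, 3, 'ATM'), (4, 1, 4, 'RTMA'), (3, 20, 3, 'RTM');
    """
)

# a: average of 1/Pl over every game the player took part in (by shortcut).
# b: number of games the player actually won.
query = """
SELECT a.Player, a.ExpectedWinrate, b.Wins
FROM (
    SELECT p.Pname AS Player,
           ROUND(SUM(1.0 / w.Pl) / COUNT(*) * 100, 1) AS ExpectedWinrate
    FROM Wins w
    JOIN Players p ON w.Code LIKE '%' || p.SC || '%'
    GROUP BY p.Pname
) AS a
JOIN (
    SELECT p.Pname AS Player, COUNT(*) AS Wins
    FROM Players p
    JOIN Wins w ON p.PNr = w.PNr
    GROUP BY p.Pname
) AS b ON a.Player = b.Player
ORDER BY a.Player
"""
winrates = conn.execute(query).fetchall()
for row in winrates:
    print(row)
```

Because this is an inner join, a player without any win (Alex in the sample data) does not show up; a `LEFT JOIN` from `a` to `b` would keep such players with a NULL win count.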

Collaborative filtering in MySQL?

I'm trying to develop a site that recommends items (e.g. books) to users based on their preferences. So far, I've read O'Reilly's "Collective Intelligence" and numerous other online articles. They all, however, seem to deal with single instances of recommendation, for example if you like book A then you might like book B.
What I'm trying to do is to create a set of 'preference-nodes' for each user on my site. Let's say a user likes book A, B and C. Then, when they add book D, I don't want the system to recommend other books based solely on other users' experience with book D. I want the system to look up similar 'preference-nodes' and recommend books based on that.
Here's an example of 4 nodes:
User1: 'book A'->'book B'->'book C'
User2: 'book A'->'book B'->'book C'->'book D'
user3: 'book X'->'book Y'->'book C'->'book Z'
user4: 'book W'->'book Q'->'book C'->'book Z'
So a recommendation system, as described in the material I've read, would recommend book Z to User 1, because there are two people who recommend Z in conjunction with liking C (i.e. Z weighs more than D), even though a user with a similar 'preference-node', User2, would be more qualified to recommend book D because he has a more similar interest pattern.
So do any of you have any experience with this sort of thing? Is there something I should read, or are there any open source systems for this?
Thanks for your time!
Small edit: I think last.fm's algorithm is doing exactly what I want my system to do: using the preference-trees of people to recommend music more personally, instead of just saying "you might like B because you liked A".

Create a table and insert the test data:
CREATE TABLE `ub` (
`user_id` int(11) NOT NULL,
`book_id` varchar(10) NOT NULL,
PRIMARY KEY (`user_id`,`book_id`),
UNIQUE KEY `book_id` (`book_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into ub values (1, 'A'), (1, 'B'), (1, 'C');
insert into ub values (2, 'A'), (2, 'B'), (2, 'C'), (2,'D');
insert into ub values (3, 'X'), (3, 'Y'), (3, 'C'), (3,'Z');
insert into ub values (4, 'W'), (4, 'Q'), (4, 'C'), (4,'Z');
Join the test data onto itself by book_id, and create a temporary table to hold each user_id and the number of books it has in common with the target user_id:
create temporary table ub_rank as
select similar.user_id,count(*) rank
from ub target
join ub similar on target.book_id= similar.book_id and target.user_id != similar.user_id
where target.user_id = 1
group by similar.user_id;
select * from ub_rank;
+---------+------+
| user_id | rank |
+---------+------+
| 2 | 3 |
| 3 | 1 |
| 4 | 1 |
+---------+------+
3 rows in set (0.00 sec)
We can see that user_id 2 has 3 books in common with user_id 1, but user_id 3 and user_id 4 only have 1 each.
Next, select all the books that the users in the temporary table have that do not match the target user_id's books, and arrange these by rank. Note that the same book might appear in different user's lists, so we sum the rankings for each book so that common books get a higher ranking.
select similar.book_id, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.user_id = similar.user_id
left join ub target on target.user_id = 1 and target.book_id = similar.book_id
where target.book_id is null
group by similar.book_id
order by total_rank desc;
+---------+------------+
| book_id | total_rank |
+---------+------------+
| D | 3 |
| Z | 2 |
| X | 1 |
| Y | 1 |
| Q | 1 |
| W | 1 |
+---------+------------+
6 rows in set (0.00 sec)
Book Z appeared in two user lists, and so was ranked above X,Y,Q,W which only appeared in one user's list. Book D did best because it appeared in user_id 2's list, which had 3 items in common with target user_id 1.
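The two steps can also be folded into one statement with a common table expression in place of the temporary table, with the target user passed as a parameter. The sketch below is a variation for illustration, run on SQLite through Python's `sqlite3`. The column is called `score` rather than `rank` because `RANK` is a reserved word in newer MySQL versions, and the extra `ORDER BY` on `book_id` is only there to make the order of tied books deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ub (user_id INT NOT NULL, book_id TEXT NOT NULL, "
    "PRIMARY KEY (user_id, book_id))"
)
conn.executemany(
    "INSERT INTO ub VALUES (?, ?)",
    [
        (1, "A"), (1, "B"), (1, "C"),
        (2, "A"), (2, "B"), (2, "C"), (2, "D"),
        (3, "X"), (3, "Y"), (3, "C"), (3, "Z"),
        (4, "W"), (4, "Q"), (4, "C"), (4, "Z"),
    ],
)

# ub_score: how many books each other user shares with the target user.
# Main query: books of those users that the target does not have yet,
# weighted by the summed overlap scores of the users who have them.
query = """
WITH ub_score AS (
    SELECT other.user_id, COUNT(*) AS score
    FROM ub AS target
    JOIN ub AS other
      ON other.book_id = target.book_id AND other.user_id != target.user_id
    WHERE target.user_id = :target
    GROUP BY other.user_id
)
SELECT other.book_id, SUM(ub_score.score) AS total_score
FROM ub_score
JOIN ub AS other ON other.user_id = ub_score.user_id
LEFT JOIN ub AS mine
  ON mine.user_id = :target AND mine.book_id = other.book_id
WHERE mine.book_id IS NULL
GROUP BY other.book_id
ORDER BY total_score DESC, other.book_id
"""
recommendations = conn.execute(query, {"target": 1}).fetchall()
print(recommendations)
```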
