I have a table in the form:
id comp employment_year
1 ShoesCo 2000
1 FeetOrg 2006
1 SizeEight 2012
2 ShoesCo 2001
2 SizeEight 2004
2 FeetOrg 2007
3 SizeEight 2001
3 ShoesCo 2004
3 FeetOrg 2007
I want to count the total number of people who worked at ShoesCo before working at SizeEight, comparing by employment_year. The id is the unique ID for each employee. I am thinking of a self-join but have limited experience with SQL.
The answer should be 2 for this example.
If the data have no duplicates by (id, comp), then (substituting your real table name for table):
SELECT COUNT(DISTINCT id)
FROM table t1
JOIN table t2 USING (id)
WHERE t1.comp = 'ShoesCo'
AND t2.comp = 'SizeEight'
AND t1.employment_year < t2.employment_year
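To check the self-join against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for your actual database, and the table name employment is an assumption, since the question does not name the table.

```python
import sqlite3

# In-memory SQLite database; the table name "employment" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employment (id INTEGER, comp TEXT, employment_year INTEGER)")
conn.executemany(
    "INSERT INTO employment VALUES (?, ?, ?)",
    [
        (1, "ShoesCo", 2000), (1, "FeetOrg", 2006), (1, "SizeEight", 2012),
        (2, "ShoesCo", 2001), (2, "SizeEight", 2004), (2, "FeetOrg", 2007),
        (3, "SizeEight", 2001), (3, "ShoesCo", 2004), (3, "FeetOrg", 2007),
    ],
)

# Self-join: pair each employee's ShoesCo row with the same employee's
# SizeEight row, keeping only pairs where ShoesCo came first.
query = """
SELECT COUNT(DISTINCT t1.id)
FROM employment t1
JOIN employment t2 USING (id)
WHERE t1.comp = 'ShoesCo'
  AND t2.comp = 'SizeEight'
  AND t1.employment_year < t2.employment_year
"""
shoes_before_size_eight = conn.execute(query).fetchone()[0]
print(shoes_before_size_eight)
```

Employees 1 and 2 joined ShoesCo before SizeEight, while employee 3 did it the other way round, so only the first two are counted.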
Re-post due to bad data set and bad formatting. I am trying to divide data from two separate tables that have ambiguous column names.
I am new to SQL. I know it should be simple, but I just cannot figure it out. So far I have tried renaming columns, aliasing columns, unioning the tables, and selecting multiple data sets.
I keep hitting roadblocks.
I am trying to measure growth or decline week over week. Ideally I want to take the total sales for Plates and apply the following equation: (75/100-1), which equals -0.25, i.e. a 25% decline from last week.
What would be the best way to go about this?
The two example tables are below
LastWeekData
Product Day Month TotalSales
Plates 7 3 $100
Spoons 7 3 $150
Forks 7 3 $120
CurrentData
Product Day Month TotalSales
Plates 14 3 $75
Spoons 14 3 $100
Forks 14 3 $115
You can use table aliases to tell apart the identically named columns of the two tables. See demo here: http://sqlfiddle.com/#!9/0b0d81/29
select cur.Product,
cur.Day,
cur.Month,
cur.TotalSales as currweek_TotalSales,
pre.TotalSales as lastweek_TotalSales,
round((cur.TotalSales/pre.TotalSales-1)*100) as percent_change
from CurrentData as cur
inner join LastWeekData as pre
on pre.product=cur.product
where datediff(str_to_date(concat_ws('-','0001',cur.month,cur.day),'%Y-%m-%d'),
str_to_date(concat_ws('-','0001',pre.month,pre.day),'%Y-%m-%d'))
= 7
Result:
Product Day Month currweek_TotalSales lastweek_TotalSales percent_change
Plates 14 3 75 100 -25
Spoons 14 3 100 150 -33
Forks 14 3 115 120 -4
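If you want to try the aliasing idea without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. It is an approximation, not the query above: SQLite has no str_to_date, so the 7-day date filter is left out and the join assumes each table holds exactly one row per product. TotalSales is stored as a plain number (no $ sign), which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LastWeekData (Product TEXT, Day INTEGER, Month INTEGER, TotalSales REAL);
CREATE TABLE CurrentData  (Product TEXT, Day INTEGER, Month INTEGER, TotalSales REAL);
INSERT INTO LastWeekData VALUES ('Plates', 7, 3, 100), ('Spoons', 7, 3, 150), ('Forks', 7, 3, 120);
INSERT INTO CurrentData  VALUES ('Plates', 14, 3, 75), ('Spoons', 14, 3, 100), ('Forks', 14, 3, 115);
""")

# The aliases cur and pre remove the ambiguity between the two
# TotalSales columns; the join matches rows by product only.
query = """
SELECT cur.Product,
       ROUND((cur.TotalSales / pre.TotalSales - 1) * 100) AS percent_change
FROM CurrentData AS cur
INNER JOIN LastWeekData AS pre ON pre.Product = cur.Product
"""
percent_change = dict(conn.execute(query).fetchall())
print(percent_change)
```

The column is declared REAL so the division is not truncated to an integer; with integer columns you would need to multiply by 1.0 first.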
I am creating a library database with four tables, shown below.
I have been researching ways to work out frequency in MySQL, but after a long time and a lot of misunderstanding I've decided to ask for an example of how to work out the frequency on tables that I understand.
I am looking to work out the loan frequency of every book that has been loaned 2 or more times. That would show me how to work out frequency for selected values instead of all values.
From looking at my tables, I would have to select the 'code' from the loan table, keep the values that occur twice or more, and then work out the frequency of occurrence.
From my research I would use an INNER JOIN to connect the tables, COUNT to count the values, GROUP BY to group them, and HAVING, since WHERE cannot be used here. I am having trouble writing the query and keep running into errors. Could anyone use the example above to explain how to work out the frequency of each book loaned two times or more? Thanks in advance.
Table 1 - book
isbn title author
111-2-33-444444-5 Pro JavaFX Dave Smith
222-3-44-555555-6 Oracle Systems Kate Roberts
333-4-55-666666-7 Expert jQuery Mike Smith
Table 2 - copy
code isbn duration
1011 111-2-33-444444-5 21
1012 111-2-33-444444-5 14
1013 111-2-33-444444-5 7
2011 222-3-44-555555-6 21
3011 333-4-55-666666-7 7
3012 333-4-55-666666-7 14
Table 3 - student
no name school embargo
2001 Mike CMP No
2002 Andy CMP Yes
2003 Sarah ENG No
2004 Karen ENG Yes
2005 Lucy BUE No
Table 4 - loan
code no taken due return
1011 2002 2015.01.10 2015.01.31 2015.01.31
1011 2002 2015.02.05 2015.02.26 2015.02.23
1011 2003 2015.05.10 2015.05.31
1013 2003 2014.03.02 2014.03.16 2014.03.10
1013 2002 2014.08.02 2014.08.16 2014.08.16
2011 2004 2013.02.01 2013.02.22 2013.02.20
3011 2002 2015.07.03 2015.07.10
3011 2005 2014.10.10 2014.10.17 2014.10.20
You didn't specify the type of frequency, but this query calculates the number of loans per week for each book that was loaned more than once in 2014:
select b.isbn
, b.title
, count(*) / 52 -- loans/week
from loan l
join copy c
on c.code = l.code
join book b
on b.isbn = c.isbn
where '2014-01-01' <= taken and taken < '2015-01-01'
group by
b.isbn
, b.title
having count(*) > 1 -- loaned more than once
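Here is a rough way to run that query against the sample tables, using Python's sqlite3 module in place of MySQL. A few things are assumptions of mine: the dates are stored as ISO strings (2014-03-02 rather than 2014.03.02) so that string comparison works, the loan column no is renamed student_no, and the count is divided by 52.0 because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (isbn TEXT, title TEXT, author TEXT);
CREATE TABLE copy (code INTEGER, isbn TEXT, duration INTEGER);
CREATE TABLE loan (code INTEGER, student_no INTEGER, taken TEXT);
INSERT INTO book VALUES
  ('111-2-33-444444-5', 'Pro JavaFX', 'Dave Smith'),
  ('222-3-44-555555-6', 'Oracle Systems', 'Kate Roberts'),
  ('333-4-55-666666-7', 'Expert jQuery', 'Mike Smith');
INSERT INTO copy VALUES
  (1011, '111-2-33-444444-5', 21), (1012, '111-2-33-444444-5', 14),
  (1013, '111-2-33-444444-5', 7),  (2011, '222-3-44-555555-6', 21),
  (3011, '333-4-55-666666-7', 7),  (3012, '333-4-55-666666-7', 14);
INSERT INTO loan VALUES
  (1011, 2002, '2015-01-10'), (1011, 2002, '2015-02-05'), (1011, 2003, '2015-05-10'),
  (1013, 2003, '2014-03-02'), (1013, 2002, '2014-08-02'), (2011, 2004, '2013-02-01'),
  (3011, 2002, '2015-07-03'), (3011, 2005, '2014-10-10');
""")

# Count 2014 loans per book (not per copy) and keep books loaned more than once.
query = """
SELECT b.isbn, b.title, COUNT(*) AS loans, COUNT(*) / 52.0 AS loans_per_week
FROM loan l
JOIN copy c ON c.code = l.code
JOIN book b ON b.isbn = c.isbn
WHERE l.taken >= '2014-01-01' AND l.taken < '2015-01-01'
GROUP BY b.isbn, b.title
HAVING COUNT(*) > 1
"""
frequent_books = conn.execute(query).fetchall()
for row in frequent_books:
    print(row)
```

Only Pro JavaFX qualifies for 2014: copy 1013 was loaned twice that year, while Expert jQuery was loaned once. Dropping the WHERE clause would count loans over the whole history instead.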
I am trying to execute a SQL query such that the following table:
id in_year out_year
------- ---------- -------------
1 2001 2002
2 2002 2002
3 2004 2007
can be queried such that I get all the years within that range mapped to the id. For instance, I would like to get:
id year
--------- ---------
1 2001
1 2002
2 2002
3 2004
3 2005
3 2006
3 2007
Specifically, let's say the table represents a shop's items with the year each arrived at the shop and the year it was sold. The query would return every item id mapped to each year in which it was in the shop.
You can construct a helper table holding the years that cover your range of data, e.g.
CREATE TABLE tmp_years (
yr YEAR NOT NULL,
PRIMARY KEY (yr)
) ENGINE=INNODB;
INSERT INTO tmp_years (yr) VALUES (2000), (2001), (2002), (2003), (2004), (2005), (2006), (2007);
and then do a JOIN:
SELECT w.id, y.yr FROM wesams_table w
INNER JOIN tmp_years y ON (y.yr >= w.in_year AND y.yr <= w.out_year);
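As a quick check of the years-table approach, here is a sketch that runs it in SQLite from Python. The table name items is made up for the illustration, and the yr column is a plain INTEGER because SQLite has no YEAR type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER, in_year INTEGER, out_year INTEGER);
INSERT INTO items VALUES (1, 2001, 2002), (2, 2002, 2002), (3, 2004, 2007);
CREATE TABLE tmp_years (yr INTEGER PRIMARY KEY);
""")
# Fill the helper table with every year from 2000 to 2007 inclusive.
conn.executemany("INSERT INTO tmp_years (yr) VALUES (?)", [(y,) for y in range(2000, 2008)])

# Range join: each item matches every helper year inside [in_year, out_year].
query = """
SELECT w.id, y.yr
FROM items w
INNER JOIN tmp_years y ON y.yr BETWEEN w.in_year AND w.out_year
ORDER BY w.id, y.yr
"""
id_years = conn.execute(query).fetchall()
for row in id_years:
    print(row)
```

The helper table only has to cover the smallest in_year and the largest out_year in your data; any extra years in it simply match nothing.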
The tidiest solution would be to create a UDF that returns the range of years and use it with CROSS APPLY.
Performance should be rather good, as the UDF will be deterministic.
Edit: Sorry, I don't think this applies to MySQL.
I have a MySQL query:
SELECT px.player, px.pos, px.year, px.age, px.gp, px.goals, px.assists
, 1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists) sim
FROM hockey p1
JOIN hockey px
ON px.player <> p1.player
WHERE p1.player = 'John Smith'
AND p1.year = 2010
HAVING sim >= 900
ORDER BY sim DESC
This gets me a table of results, something like this:
player pos year age gp goals assists sim
Player1 LW 2002 25 75 29 32 961
Player2 LW 2000 27 82 29 27 956
Player3 RW 2000 27 78 29 33 955
Player4 LW 2009 26 82 30 30 940
Player5 RW 2001 25 79 33 24 938
Player6 LW 2008 25 82 23 24 936
Player7 LW 2006 27 79 26 33 932
Instead, I would like it to do two things. Average the data and add a player count, so I get something like:
players age gp goals assists sim
7 26 79 28 29 945
I tried avg(px.age), avg(px.gp), avg(px.goals)...etc but I am running into errors with my "sim" formula.
Second issue is that underneath that, I would like to have the average of the data for the FOLLOWING year. In other words data from Player1 in 2003, data from Player2 in 2001, etc.
I am stuck as to HOW to get the data to average AND to get it for the following year.
Can anyone help me with either or both of these issues?
To get a single subtotal of counts and averages, just wrap your original query as the inner select, something like this (pq = "PreQuery" select result):
Select
max( "Tot Players" ) Players,
max( "->" ) position,
count(*) Year,
avg( pq.age ) AvgAge,
avg( pq.gp ) AvgGP,
avg( pq.goals ) AvgGoals,
avg( pq.assists ) AvgAssists,
avg( pq.sim ) AvgSim
from
( SELECT
px.player,
px.pos,
px.year,
px.age,
px.gp,
px.goals,
px.assists,
1000 - ABS(p1.gp - px.gp)
- ABS(p1.goals - px.goals)
- ABS(p1.assists - px.assists) sim
FROM
hockey p1
JOIN hockey px ON px.player <> p1.player
WHERE
p1.player = 'John Smith'
AND p1.year = 2010
HAVING
sim >= 900
ORDER BY
sim DESC ) pq
If your original query worked, this should get you your overall averages. However, the HAVING and ORDER BY in the inner query might cause a problem. You can drop the ORDER BY, since it makes no difference to the outermost query. The HAVING clause in the inner query might need to become a WHERE pq.sim >= 900 in the outer SELECT.
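Here is a stripped-down sketch of that wrapping idea, with the filter already moved to an outer WHERE, run in SQLite from Python. The hockey rows are invented purely for the illustration (including John Smith's 2010 stats), and the bogus placeholder columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hockey (player TEXT, pos TEXT, year INTEGER, age INTEGER,"
    " gp INTEGER, goals INTEGER, assists INTEGER)"
)
# Hypothetical sample rows, not data from the question.
conn.executemany(
    "INSERT INTO hockey VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("John Smith", "LW", 2010, 26, 80, 30, 30),
        ("Player A", "LW", 2002, 25, 75, 29, 32),  # sim = 992
        ("Player B", "RW", 2000, 27, 82, 28, 27),  # sim = 993
        ("Player C", "LW", 2005, 30, 20, 2, 3),    # sim = 885, filtered out
    ],
)

# The inner query computes sim per comparable player; the outer query
# filters on the alias and then aggregates.
query = """
SELECT COUNT(*) AS players,
       AVG(pq.age), AVG(pq.gp), AVG(pq.goals), AVG(pq.assists), AVG(pq.sim)
FROM (
    SELECT px.player, px.age, px.gp, px.goals, px.assists,
           1000 - ABS(p1.gp - px.gp)
                - ABS(p1.goals - px.goals)
                - ABS(p1.assists - px.assists) AS sim
    FROM hockey p1
    JOIN hockey px ON px.player <> p1.player
    WHERE p1.player = 'John Smith'
      AND p1.year = 2010
) pq
WHERE pq.sim >= 900
"""
averages = conn.execute(query).fetchone()
print(averages)
```

Because sim is an ordinary column of the derived table pq, the outer query can both filter on it and average it, which sidesteps the problem of using the alias in the same SELECT that defines it.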
Additionally, if you want the results of all players first, THEN the total, take your original query and UNION it with this one. To keep the columns in sync between BOTH queries, I've put bogus values in for player and position so it won't fail on mismatched unions. Notice that my COUNT column lines up with the YEAR column of the ORIGINAL query.
For the prior year... As Rob mentioned, you would just do a UNION of the two queries just showing the respective year you were qualifying for in each UNION...
EDIT --- CLARIFICATION for 2nd YEAR....
Per your subsequent comment clarification, you would have to take each row's year +1 as the basis. If you then want the overall averages again, wrap this in an outer query with max / avg, etc. But I think THIS is what you want for the subsequent year per player:
SELECT
PrimaryQry.PrimaryPlayer,
PrimaryQry.PrimaryPos,
PrimaryQry.PrimaryYear,
PrimaryQry.PrimaryAge,
PrimaryQry.PrimaryGP,
PrimaryQry.PrimaryGoals,
PrimaryQry.PrimaryAssists,
PrimaryQry.player,
PrimaryQry.pos,
PrimaryQry.year,
PrimaryQry.age,
PrimaryQry.gp,
PrimaryQry.goals,
PrimaryQry.assists,
PrimaryQry.sim,
p2.pos PrimaryPos2,
p2.year PrimaryYear2,
p2.age PrimaryAge2,
p2.gp PrimaryGP2,
p2.goals PrimaryGoals2,
p2.assists PrimaryAssists2,
px2.player player2,
px2.pos pos2,
px2.year year2,
px2.age age2,
px2.gp gp2,
px2.goals goals2,
px2.assists assists2,
1000 - ABS(p2.gp - px2.gp)
- ABS(p2.goals - px2.goals)
- ABS(p2.assists - px2.assists) sim2
FROM
( SELECT
p1.player PrimaryPlayer,
p1.pos PrimaryPos,
p1.year PrimaryYear,
p1.age PrimaryAge,
p1.gp PrimaryGP,
p1.goals PrimaryGoals,
p1.assists PrimaryAssists,
px.player,
px.pos,
px.year,
px.age,
px.gp,
px.goals,
px.assists,
1000 - ABS(p1.gp - px.gp)
- ABS(p1.goals - px.goals)
- ABS(p1.assists - px.assists) sim
FROM
hockey p1
JOIN hockey px
ON p1.player <> px.player
WHERE
p1.player = 'John Smith'
AND p1.year = 2010
HAVING
sim >= 900 ) PrimaryQry
JOIN hockey p2
ON PrimaryQry.PrimaryPlayer = p2.player
AND PrimaryQry.PrimaryYear +1 = p2.year
JOIN hockey px2
ON PrimaryQry.Player = px2.Player
AND PrimaryQry.Year +1 = px2.year
If you follow the logic here, you already know the inner query returns about 10 other players, so I keep the stats of the primary player in that query too. I then join that result set back to the hockey table TWICE. The first join matches the primary player to his/her own row for year +1; the SECOND join matches each player who qualified against the primary player to that player's row for year +1. The final columns hold the entire first-year comparison followed by the second-year comparison.
So, it will all be on one row, consecutively:
John Smith 2010 Compare Person 1 YearA John Smith 2011 Compare Person 1 YearA+1
John Smith 2010 Compare Person 2 YearB John Smith 2011 Compare Person 2 YearB+1
John Smith 2010 Compare Person 3 YearC John Smith 2011 Compare Person 3 YearC+1
What query are you using to get the averages?
Just applying AVG to your expression for 'sim' should work in MySQL, e.g.
AVG(1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists)) sim
To aggregate over different years, I think there is no alternative to using a subselect or union.
Reference:
http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
http://dev.mysql.com/doc/refman/5.0/en/union.html
Something like:
(ORIGINAL AVG QUERY)
UNION ALL
(ORIGINAL AVG QUERY WITH NEW YEAR)
should do the trick.
(Note that your original query selects data from every year to compare it to the data for John Smith in 2010, which may not be what you want.)
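One possible shape for the UNION ALL version is sketched below, run in SQLite from Python. It assumes a database with common table expressions (MySQL 8.0 or later, not the 5.0 linked above), and every row in the hockey table here is invented for the illustration. The following-year stats are deliberately far from John Smith's so that those rows do not qualify as matches themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hockey (player TEXT, pos TEXT, year INTEGER, age INTEGER,"
    " gp INTEGER, goals INTEGER, assists INTEGER)"
)
# Hypothetical sample rows: one matching season per player plus the next one.
conn.executemany(
    "INSERT INTO hockey VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("John Smith", "LW", 2010, 26, 80, 30, 30),
        ("Player A", "LW", 2002, 25, 75, 29, 32),  # sim = 992, a match
        ("Player A", "LW", 2003, 26, 20, 5, 5),    # following year (sim = 890)
        ("Player B", "RW", 2000, 27, 82, 28, 27),  # sim = 993, a match
        ("Player B", "RW", 2001, 28, 10, 2, 4),    # following year (sim = 876)
    ],
)

# "matches" holds the qualifying player-seasons; the second branch joins
# each match to the same player's row one year later.
query = """
WITH matches AS (
    SELECT px.player, px.year, px.age, px.gp, px.goals, px.assists
    FROM hockey p1
    JOIN hockey px ON px.player <> p1.player
    WHERE p1.player = 'John Smith'
      AND p1.year = 2010
      AND 1000 - ABS(p1.gp - px.gp)
               - ABS(p1.goals - px.goals)
               - ABS(p1.assists - px.assists) >= 900
)
SELECT 'matched year' AS label, COUNT(*), AVG(age), AVG(gp), AVG(goals), AVG(assists)
FROM matches
UNION ALL
SELECT 'following year', COUNT(*), AVG(nx.age), AVG(nx.gp), AVG(nx.goals), AVG(nx.assists)
FROM matches m
JOIN hockey nx ON nx.player = m.player AND nx.year = m.year + 1
"""
by_label = {row[0]: row[1:] for row in conn.execute(query)}
print(by_label)
```

The inner join in the second branch silently drops any matched player who has no row for the following year, so the two player counts can differ.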