I'm very new to MySQL and have looked at many answers on this site, but I can't get the following to work.
The table is "member".
It has 3 fields: "id" (integer) and 2 date fields, "dob" and "expiry".
I need to count only the records for current members, i.e.
expiry<curdate()
then I need to know the count of records with the following conditions:
year(curdate())-year(dob) <25 as young
year(curdate())-year(dob) >25 and <=50 as Medium
year(curdate())-year(dob) >50 as Older
So I expect to get a single row with many columns and the count of each of these conditions.
Effectively I'm filtering current members for their age grouping.
I've tried a subquery but failed to get that to work.
Thanks
If you really want the end result in the form you described, you could use views. It is a roundabout way to get there, but here it is. I created the following member table and inserted data as follows.
CREATE TABLE member (
id int(11) AUTO_INCREMENT PRIMARY KEY,
dob date DEFAULT NULL,
expiry date DEFAULT NULL
);
INSERT INTO member (id, dob, expiry) VALUES
(1, '1980-01-01', '2020-05-05'),
(2, '1982-05-05', '2020-01-01'),
(3, '1983-05-05', '2020-01-01'),
(4, '1981-05-05', '2020-01-01'),
(5, '1994-05-05', '2020-01-01'),
(6, '1992-05-05', '2020-01-01'),
(7, '1960-05-05', '2020-01-01'),
(8, '1958-05-05', '2020-01-01'),
(9, '1958-07-07', '2020-05-05');
Following is the member table with data.
id | dob | expiry
--------------------------------
1 | 1980-01-01 | 2020-05-05
2 | 1982-05-05 | 2020-01-01
3 | 1983-05-05 | 2020-01-01
4 | 1981-05-05 | 2020-01-01
5 | 1994-05-05 | 2020-01-01
6 | 1992-05-05 | 2020-01-01
7 | 1960-05-05 | 2020-01-01
8 | 1958-05-05 | 2020-01-01
9 | 1958-07-07 | 2020-05-05
Then I created a separate view of all the current members, named current_members, as follows.
CREATE VIEW current_members AS (SELECT * FROM member WHERE member.expiry >= CURDATE());
Then, querying from that view, I created 3 separate views containing the counts for each of the age ranges young, middle and old, as follows. The ranges do not overlap, so each member is counted exactly once.
CREATE VIEW young AS (SELECT COUNT(*) as Young FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age <= 25) yng);
CREATE VIEW middle AS (SELECT COUNT(*) as Middle FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age > 25 AND age <= 50) mid);
CREATE VIEW old AS (SELECT COUNT(*) as Old FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age > 50) old);
Finally, the three views were cross joined in order to get the counts of each age range into a single row of one final table as follows.
SELECT * FROM young, middle, old;
The counts depend on the current date. With all nine sample members still current, this gave the following result.
Young | Middle | Old
----------------------
2 | 4 | 3
Suggestion: to avoid repeating the tedious date-difference calculation above, you could write your own stored function to simplify the code.
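For comparison, the same single-row result can be produced by one query with conditional aggregation instead of four views. The following is only a sketch: it runs the query through Python's built-in sqlite3 module so that it can be tried without a MySQL server, so the age expression uses SQLite's strftime rather than TIMESTAMPDIFF. The function name count_age_groups and the explicit today parameter are assumptions made for the example (in MySQL you would use CURDATE()).

```python
import sqlite3

# Sample rows from the answer above: (id, dob, expiry)
MEMBERS = [
    (1, "1980-01-01", "2020-05-05"),
    (2, "1982-05-05", "2020-01-01"),
    (3, "1983-05-05", "2020-01-01"),
    (4, "1981-05-05", "2020-01-01"),
    (5, "1994-05-05", "2020-01-01"),
    (6, "1992-05-05", "2020-01-01"),
    (7, "1960-05-05", "2020-01-01"),
    (8, "1958-05-05", "2020-01-01"),
    (9, "1958-07-07", "2020-05-05"),
]

# One pass over the current members; each CASE contributes 1 to exactly one column.
# Age = year difference, minus 1 if this year's birthday has not happened yet.
QUERY = """
SELECT
    SUM(CASE WHEN age <= 25 THEN 1 ELSE 0 END)              AS young,
    SUM(CASE WHEN age > 25 AND age <= 50 THEN 1 ELSE 0 END) AS middle,
    SUM(CASE WHEN age > 50 THEN 1 ELSE 0 END)               AS old
FROM (
    SELECT (strftime('%Y', :today) - strftime('%Y', dob))
           - (strftime('%m-%d', :today) < strftime('%m-%d', dob)) AS age
    FROM member
    WHERE expiry >= :today
)
"""


def count_age_groups(today):
    """Return (young, middle, old) counts of current members as of `today` (YYYY-MM-DD)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, dob TEXT, expiry TEXT)")
    conn.executemany("INSERT INTO member VALUES (?, ?, ?)", MEMBERS)
    row = conn.execute(QUERY, {"today": today}).fetchone()
    conn.close()
    return row


print(count_age_groups("2017-06-01"))
```

Passing the date in as a parameter keeps the example reproducible; the CASE boundaries are the same non-overlapping ranges as in the views.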
Let's say I have a table customer_transactions:
desc customer_transactions;
+------------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| transactionID | varchar(128) | YES | | NULL | |
| customerID | varchar(128) | YES | | NULL | |
| amountAuthorized | DECIMAL(5,2) | YES | | NULL | |
| createdDatetime | datetime | YES | | NULL | |
+------------------------------+--------------+------+-----+---------+----------------+
This table has records of credit card transactions for a SaaS business for the last 5 years. The business has a typical monthly subscription model, where customers are automatically charged based on their plan.
I need to find the top customers that are responsible for 80% of all revenue (per time period). The SAAS business is very uneven, because some customers pay 10/month, others may pay in thousands per month.
I will add a "time period" filter later, just need help with aggregation.
I want to generate a report where I only select the customers that generated 80% of revenue in this format:
+------------+-------+
| customerID | Total |
+------------+-------+
Not sure why this question was "on hold". I just need help writing a query and do not have enough experience with SQL. Basically, the title of the question states what is needed here:
I need to list customers and their corresponding totals, however, only need to select those customers that make up 80% of total revenue. The report needs to aggregate a total per customer.
Using MariaDB version 10.3.9
This is the kind of thing you need to use window functions for.
WITH
-- define some sample data,
-- where the sum total of amountAuthorized is 10,000
customer_transactions( `id`, transactionID, customerID,
amountAuthorized, createdDatetime) AS
(
SELECT 1, 1, 1, 5000, '2018-08-01'
UNION ALL SELECT 2, 2, 2, 2000, '2018-08-01'
UNION ALL SELECT 3, 3, 3, 1000, '2018-08-01'
UNION ALL SELECT 4, 4, 4, 1000, '2018-08-01'
UNION ALL SELECT 5, 5, 5, 1000, '2018-08-01'
)
-- a query that gives us the running total, sorted to give us the biggest customers first.
-- note that the additional sorts affect what customers might be returned.
,running_totals AS
(
SELECT *, SUM(amountAuthorized) OVER (ORDER BY amountAuthorized DESC, createdDatetime DESC, `id`) AS runningTotal
FROM customer_transactions
)
SELECT *
FROM running_totals
WHERE runningTotal <= ( SELECT 0.8 * SUM(amountAuthorized)
FROM customer_transactions)
Note that this takes into account (no pun intended) all data in the table. When you want to only look at a specific time period, you might want to create an intermediate CTE that filters out the dates you want.
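The query above keeps a running total per transaction, while the question asks for one total per customer. As a rough illustration of that extra aggregation step (not a replacement for the SQL), here is the same logic in plain Python: sum per customer, sort the biggest customers first, then keep customers while the running total stays within the cutoff. The function name top_customers and the tie-break on customerID are assumptions for the example.

```python
from collections import defaultdict
from fractions import Fraction


def top_customers(transactions, share=Fraction(4, 5)):
    """Return [(customerID, total)] for the biggest customers whose
    cumulative total stays within `share` of all revenue."""
    # Step 1: aggregate one total per customer.
    totals = defaultdict(int)
    for customer_id, amount in transactions:
        totals[customer_id] += amount

    # Step 2: cutoff is 80% of overall revenue (exact arithmetic, no float rounding).
    cutoff = share * sum(totals.values())

    # Step 3: biggest customers first; ties broken by customerID so the result is stable.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))

    # Step 4: running total, stopping once the cutoff would be exceeded.
    selected, running = [], 0
    for customer_id, total in ranked:
        running += total
        if running > cutoff:
            break
        selected.append((customer_id, total))
    return selected


# Same sample amounts as the CTE above: (customerID, amountAuthorized)
sample = [(1, 5000), (2, 2000), (3, 1000), (4, 1000), (5, 1000)]
print(top_customers(sample))
```

As in the SQL version, the tie-break decides which of several equal customers makes the cut, so it is worth choosing deliberately.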
You will find that surprisingly close to 20% of the customers account for the 80%. See the 80/20 rule.
But, if you don't want to go that direction, you have 2 options:
Switch to MySQL 8.0 or MariaDB 10.2 in order to use 'windowing' functions; or
Use @variables to produce a running total, then (in an outer query) grab the desired rows.
Since you are using MariaDB 10.3.9, the windowing seems to be the way to go. But first, you need a separate query (or derived table) that computes the total revenue so you can get 80% of it.
Suggest
SELECT @revenue80 := 0.8 * SUM(amountAuthorized)
FROM customer_transactions;
Then use @revenue80 inside the WHERE that Zack suggests.
I see that each amount can be no more than 999.99. Really? Is this a coffee shop?
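Use the following, which computes each customer's total and their percentage of overall revenue. Note that the HAVING clause keeps only customers whose individual share is at least 80%: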
SELECT
ct1.customerID,
SUM(ct1.amountAuthorized) as Total,
100 * (SUM(ct1.amountAuthorized) / ct3.total_revenue) as percent_revenue
FROM
customer_transactions ct1
CROSS JOIN (SELECT SUM(amountAuthorized) AS total_revenue
FROM customer_transactions ct2) AS ct3
GROUP BY
ct1.customerID
HAVING percent_revenue >= 80
I have a database full of all sorts of records regarding baseball teams and their players. I need to write a query that shows the names of any player who has played for the teamID "MON" three consecutive years. I've already written a query that gives me the table below, showing the years they played for that team.
| nameFirst | nameLast| Year |
+-----------+---------+-------+
| Santo | Alcala | 1977 |
| Santo | Alcala | 1978 |
| Santo | Alcala | 1979 |
| Scott | Aldred | 1993 |
I'm too lazy to enter any more records in the table, but this should be plenty to understand the situation. The actual table in my DB has thousands of records. So the query I need would return one record for Santo Alcala since he played three consecutive years for the MON team. The above table only shows players who played for MON, I already wrote a query that excludes all players who played for teams other than MON.
The desired output of the query would be a record such as:
| nameFirst | nameLast|
+--------------+---------+
| Santo | Alcala |
If a player played for more than 3 consecutive years on the team, they would also be shown in the results.
Are you looking for something like the below?
Schema
CREATE TABLE PLAYER (
ID INT,
FIRST VARCHAR(25),
LAST VARCHAR(25),
YEAR INT
);
INSERTS
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (1, "Santo", "Alcala", 1977);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (2, "Santo", "Alcala", 1978);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (3, "Santo", "Alcala", 1979);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (4, "Santo", "Alcala", 1980);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (5, "Santo", "Aldred", 1993);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (6, "Santo", "Aldred", 1994);
INSERT INTO PLAYER (ID, FIRST, LAST, YEAR) VALUES (7, "Santo", "Royal", 1994);
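Query
-- p1, p2 and p3 are three consecutive seasons of the same player
SELECT DISTINCT p1.FIRST, p1.LAST
FROM PLAYER p1
INNER JOIN PLAYER p2 ON p2.YEAR = p1.YEAR + 1 AND p2.FIRST = p1.FIRST AND p2.LAST = p1.LAST
INNER JOIN PLAYER p3 ON p3.YEAR = p1.YEAR + 2 AND p3.FIRST = p1.FIRST AND p3.LAST = p1.LAST;
OUTPUT
FIRST LAST
Santo Alcala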
SQL FIDDLE
http://sqlfiddle.com/#!9/b5c1c4/1
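To make the "three consecutive years" condition concrete, here is a small sketch of the same check in plain Python, using the sample rows inserted above. The function name players_with_streak and its length parameter are made up for the example; the idea is simply that a player qualifies if some year Y exists for which Y, Y+1 and Y+2 are all present.

```python
from collections import defaultdict

# Sample rows from the INSERTs above: (first, last, year)
ROWS = [
    ("Santo", "Alcala", 1977),
    ("Santo", "Alcala", 1978),
    ("Santo", "Alcala", 1979),
    ("Santo", "Alcala", 1980),
    ("Santo", "Aldred", 1993),
    ("Santo", "Aldred", 1994),
    ("Santo", "Royal", 1994),
]


def players_with_streak(rows, length=3):
    """Return players who have at least `length` consecutive years."""
    # Collect the set of years each player appeared in.
    years_by_player = defaultdict(set)
    for first, last, year in rows:
        years_by_player[(first, last)].add(year)

    result = []
    for player in sorted(years_by_player):
        years = years_by_player[player]
        # A streak starts at y if y, y+1, ..., y+length-1 are all present.
        if any(all(y + k in years for k in range(length)) for y in years):
            result.append(player)
    return result


print(players_with_streak(ROWS))
```

A player with four or more consecutive years also passes, since any longer streak contains a streak of three.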
I have a table that looks like this:
CustomerID | ContactTime | AttemptResult
-----------+-----------------+-----------------
1 | 1/1/2016 5:00 | Record Started
1 | 1/1/2016 6:00 | Appointment
2 | 1/2/2016 5:00 | Record Started
1 | 1/3/2016 6:00 | Sold
2 | 1/2/2016 5:00 | Sold
3 | 1/4/2016 5:00 | Record Started
3 | 1/4/2016 6:00 | Sold
It was created from:
create table #temp1
(
CustomerID int,
ContactTime datetime,
AttemptResult nvarchar(50)
)
insert into #temp1 values (1, '1/1/2016 5:00', 'Record Started')
insert into #temp1 values (1, '1/1/2016 6:00', 'Appointment')
insert into #temp1 values (2, '1/2/2016 5:00', 'Record Started')
insert into #temp1 values (1, '1/3/2016 6:00', 'Sold')
insert into #temp1 values (2, '1/2/2016 5:00', 'Sold')
insert into #temp1 values (3, '1/4/2016 5:00', 'Record Started')
insert into #temp1 values (3, '1/4/2016 6:00', 'Sold')
How can I query this in a way that gets, for each customer, the AttemptResult values combined in chronological order? So something like:
CustID | Sequence
-------+--------------------------------------
1 | Record Started -> Appointment -> Sold
2 | Record Started -> Sold
3 | Record Started -> Sold
I'm not even sure where to start...
If this is your complete dataset, I can help you. Otherwise I would need to see more. Use something called a window function. The query below essentially numbers each customer's entries in ContactTime order, which keeps track of how many entries there are for each CustomerID.
Select *, row_number() over (partition by CustomerID order by ContactTime) as Combo
into #temp
from YourTable -- replace with the name of your table
Then count how many combos of 2 happen (Record Started -> Sold) and how many combos of 3 (Record Started -> Appointment -> Sold):
Select CustomerID, max(Combo) as MaxCombo
into #temp1
from #temp
group by CustomerId
Select MaxCombo, count(*)
from #temp1
group by MaxCombo
You could also use Common Table Expressions instead of these temp tables, but I didn't want to add too much confusion.
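The queries above count how long each customer's chain is, but they do not build the "Record Started -> Appointment -> Sold" strings the question asks for. As a sketch of that step outside the database, the rows can be sorted by customer and time and then joined per customer. The function name build_sequences is an assumption, and rows with identical timestamps (customer 2 in the sample) simply keep their input order.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (CustomerID, ContactTime, AttemptResult)
CONTACTS = [
    (1, "1/1/2016 5:00", "Record Started"),
    (1, "1/1/2016 6:00", "Appointment"),
    (2, "1/2/2016 5:00", "Record Started"),
    (1, "1/3/2016 6:00", "Sold"),
    (2, "1/2/2016 5:00", "Sold"),
    (3, "1/4/2016 5:00", "Record Started"),
    (3, "1/4/2016 6:00", "Sold"),
]


def build_sequences(contacts):
    """Return {CustomerID: 'A -> B -> C'} with results in chronological order."""
    def sort_key(row):
        customer_id, contact_time, _result = row
        return customer_id, datetime.strptime(contact_time, "%m/%d/%Y %H:%M")

    # sorted() is stable, so rows with equal (customer, time) keep their input order.
    steps = defaultdict(list)
    for customer_id, _time, result in sorted(contacts, key=sort_key):
        steps[customer_id].append(result)
    return {customer_id: " -> ".join(results) for customer_id, results in steps.items()}


for customer_id, sequence in build_sequences(CONTACTS).items():
    print(customer_id, "|", sequence)
```

Counting how often each distinct sequence occurs is then a matter of tallying the dictionary's values.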
I have the following table which represents bookings of articles:
+---+------------+----------+-------------+-------------+
|id | article_id | quantity | starts_at | ends_at |
+---+------------+----------+-------------+-------------+
| 1 | 1 | 1 | 2015-03-01 | 2015-03-20 |
| 2 | 1 | 2 | 2015-03-02 | 2015-03-03 |
| 3 | 1 | 3 | 2015-03-04 | 2015-03-15 |
| 4 | 1 | 2 | 2015-03-16 | 2015-03-22 |
| 5 | 1 | 2 | 2015-03-11 | 2015-03-19 |
| 6 | 2 | 2 | 2015-03-06 | 2015-03-22 |
| 7 | 2 | 3 | 2015-03-02 | 2015-03-04 |
+---+------------+----------+-------------+-------------+
From this table I want to extract the following information:
+------------+----------+
| article_id | sum |
+------------+----------+
| 1 | 6 |
| 2 | 3 |
+------------+----------+
Sum represents the max sum of quantity of stacked/overlapping booked articles for the given time ranges. In the first table article with id=1 has its maximum from booking 1, 3 and 5.
Is there any MySQL solution to obtain this information from a table like this?
Thank you very much!
EDIT: The date intersections are crucial. Let's say booking 5 starts at 2015-03-17; then the sum for article_id=1 is 5, because bookings 3 and 5 no longer overlap. The SQL should automatically consider all possible overlaps.
My answer is going to seem crazy complicated, perhaps; but it isn't, if one accepts that the use of a calendar table is an excellent MySQL idiom for dealing with date range related issues. I've closely adapted calendar table code from Artful Software's calendar table article. Artful Software's query techniques are a wonderful resource for doing complicated things in MySQL. The calendar table gives you a row per individual date that you are working with, which makes many things much easier.
For the whole thing below, you can go to this sqlfiddle for a place to play around with the code. It'll take a while to load.
First, here is your data:
CREATE TABLE articles
(`id` int, `article_id` int, `quantity` int, `starts_at` datetime, `ends_at` datetime);
INSERT INTO articles
(`id`, `article_id`, `quantity`, `starts_at`, `ends_at`)
VALUES
(1, 1, 1, '2015-03-01 00:00:00', '2015-03-20 00:00:00'),
(2, 1, 2, '2015-03-02 00:00:00', '2015-03-03 00:00:00'),
(3, 1, 3, '2015-03-04 00:00:00', '2015-03-15 00:00:00'),
(4, 1, 2, '2015-03-16 00:00:00', '2015-03-22 00:00:00'),
(5, 1, 2, '2015-03-11 00:00:00', '2015-03-19 00:00:00'),
(6, 2, 2, '2015-03-06 00:00:00', '2015-03-22 00:00:00'),
(7, 2, 3, '2015-03-02 00:00:00', '2015-03-04 00:00:00');
Next, here is the creation of the calendar table--I've created somewhat more date rows than needed (going back to start of year, and forward to start of next year). Ideally you just permanently keep a more massive calendar table on hand, covering a span of dates that will handle anything you could ever need. All the stuff below is going to seem quite lengthy and complex. But if you already have a calendar table lying around, the whole next section is not necessary.
CREATE TABLE calendar ( dt datetime primary key );
/* the views below will be joined and rejoined to themselves to
get the effect of creating many rows. v ends up with 10 rows. */
CREATE OR REPLACE VIEW v3 as SELECT 1 n UNION ALL SELECT 1 UNION ALL SELECT 1;
CREATE OR REPLACE VIEW v as SELECT 1 n FROM v3 a, v3 b UNION ALL SELECT 1;
/* Going to limit the calendar table to first of year of min date
and first of year after max date */
SELECT @min := makedate(year(min(starts_at)),1) FROM articles;
SELECT @max := makedate(year(max(ends_at))+1,1) FROM articles;
SET @inc = -1;
/* below we work with @min date + @inc days successively, with @inc:=@inc+1
acting like ++variable, so we start with minus 1.
We insert as many individual date rows as we want by self-joining v,
and using some kind of limit via WHERE to keep the calendar table small
for our example. For n occurrences of v below, you get a max
of 10^n rows in the calendar table. We are using v as row-creation
engine. */
INSERT INTO calendar
SELECT @min + interval @inc:=@inc+1 day as dt
FROM v a, v b, v c, v d # , v e , v f
WHERE @inc < datediff(@max,@min);
Now we are ready to find the stackings. Assuming the above (big assumption, I know), this becomes pretty easy. I'm going to do it through a few views for readability.
/* now create a view that will let us easily view the articles
related to individual dates when we query.
Not necessary, just makes things easier to read. */
CREATE OR REPLACE VIEW articles_to_dates as
SELECT c.dt, article_id
FROM articles a
INNER JOIN calendar c on c.dt between (SELECT min(starts_at) FROM articles) and (SELECT max(ends_at) FROM articles)
GROUP BY article_id, c.dt;
-- SELECT * FROM articles_to_dates -- this query would show the view's result
/* next view is the total amount of articles booked per individual date */
CREATE OR REPLACE VIEW booked_quantities_per_day AS
SELECT a2d.dt,a2d.article_id, SUM(a.quantity) as booked_quantity
FROM articles_to_dates a2d
INNER JOIN articles a on a2d.dt between a.starts_at and a.ends_at and a.article_id = a2d.article_id
GROUP BY a2d.dt, a2d.article_id
ORDER by a2d.article_id, a2d.dt;
-- SELECT * from booked_quantities_per_day -- this query would show the view's result
Finally, here are the desired results:
SELECT article_id, max(booked_quantity) max_stacked
FROM booked_quantities_per_day
GROUP BY article_id;
Results:
article_id max_stacked
1 6
2 3
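If it helps to see the calendar idea without the SQL machinery, the following Python sketch does the same thing in memory: expand every booking into one entry per day, add up the quantity booked per article per day, and take the maximum per article. The function name max_stacked is an assumption, and end dates are treated as inclusive, as BETWEEN does in the views above.

```python
from collections import Counter
from datetime import date, timedelta

# Sample bookings from the question: (id, article_id, quantity, starts_at, ends_at)
BOOKINGS = [
    (1, 1, 1, date(2015, 3, 1), date(2015, 3, 20)),
    (2, 1, 2, date(2015, 3, 2), date(2015, 3, 3)),
    (3, 1, 3, date(2015, 3, 4), date(2015, 3, 15)),
    (4, 1, 2, date(2015, 3, 16), date(2015, 3, 22)),
    (5, 1, 2, date(2015, 3, 11), date(2015, 3, 19)),
    (6, 2, 2, date(2015, 3, 6), date(2015, 3, 22)),
    (7, 2, 3, date(2015, 3, 2), date(2015, 3, 4)),
]


def max_stacked(bookings):
    """Return {article_id: highest total quantity booked on any single day}."""
    # Equivalent of booked_quantities_per_day: quantity per (article, day).
    booked = Counter()
    for _id, article_id, quantity, starts_at, ends_at in bookings:
        day = starts_at
        while day <= ends_at:  # end date is inclusive
            booked[(article_id, day)] += quantity
            day += timedelta(days=1)

    # Equivalent of the final GROUP BY article_id with MAX(booked_quantity).
    best = {}
    for (article_id, _day), quantity in booked.items():
        best[article_id] = max(best.get(article_id, 0), quantity)
    return best


print(max_stacked(BOOKINGS))
```

Moving booking 5 to start on 2015-03-17, as in the question's edit, drops article 1 to 5 because bookings 3 and 5 no longer share a day.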
This should work.
Two groups. First to get distinct list of possible 'quantity'; second - summarise them
SELECT article_id, SUM(sub.quantity) FROM
(SELECT article_id, quantity FROM table GROUP BY article_id, quantity) as sub
GROUP BY article_id
select article_id, sum(quantity) from ...
where
... select your date range ...
group by article_id
So I have an MySQL table structured like this:
CREATE TABLE `spenttime` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(11) NOT NULL,
`serverid` int(11) NOT NULL,
`time` int(11) NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `dbid_sid_day` (`userid`,`serverid`,`day`)
);
This is where I store the time spent on my game servers every day by each registered player. time is the amount of time spent, in seconds; day is a Unix timestamp for the beginning of each day. I want to create a view on my database that shows, for each user, the time spent on each server every week, with a column displaying the rank of that time, independently for each server in each week. For example, given this data (for clarity I will use the date format Y-M-D instead of a Unix timestamp for the day column in this example):
INSERT INTO `spenttime` (`userid`, `serverid`, `time`, `day`) VALUES
(1, 1, 200, '2013-04-01'),
(1, 1, 150, '2013-04-02'),
(2, 1, 100, '2013-04-02'),
(3, 1, 500, '2013-04-04'),
(2, 2, 400, '2013-04-04'),
(1, 1, 300, '2013-04-08'),
(3, 1, 200, '2013-04-08');
For that data, the view named spenttime_week should contain:
+--------+----------+--------+------------+------+
| userid | serverid | time | yearweek | rank |
+--------+----------+--------+------------+------+
| 1 | 1 | 350 | '2013-W14' | 2 |
| 2 | 1 | 100 | '2013-W14' | 3 |
| 3 | 1 | 500 | '2013-W14' | 1 |
| 2 | 2 | 400 | '2013-W14' | 1 |
| 1 | 1 | 300 | '2013-W15' | 1 |
| 3 | 1 | 200 | '2013-W15' | 2 |
+--------+----------+--------+------------+------+
I know how to generate the view without the rank; I only have trouble with the rank column...
How can I make that happen?
//edit
Additionally, this column MUST appear in the view. I cannot generate it in a select from that view, because the app where I will use it doesn't allow that...
First you need to create a view that sums the time spent by every user on the same server in the same week:
CREATE VIEW total_spent_time AS
SELECT userid,
serverid,
sum(time) AS total_time,
yearweek(day, 3) as week
FROM spenttime
GROUP BY userid, serverid, week;
Then you can create your view like this:
CREATE VIEW spenttime_week AS
SELECT
s1.userid,
s1.serverid,
s1.total_time,
s1.week,
count(s2.total_time)+1 AS `rank`
FROM
total_spent_time s1 LEFT JOIN total_spent_time s2
ON s1.serverid=s2.serverid
AND s1.userid!=s2.userid
AND s1.week = s2.week
AND s1.total_time<=s2.total_time
GROUP BY
s1.userid,
s1.serverid,
s1.total_time,
s1.week
ORDER BY
s1.week, s1.serverid, s1.userid
Please see a fiddle here.
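To check the two views against the expected table in the question, here is a small Python sketch of the same logic: total the time per user, server and ISO week, then rank each total as one plus the number of higher totals on the same server in the same week. The function name weekly_ranks is an assumption. One difference to be aware of: this sketch counts strictly higher totals, so tied users share the better rank, whereas the view above uses <= and would give both tied users the worse one.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (userid, serverid, time, day)
ROWS = [
    (1, 1, 200, "2013-04-01"),
    (1, 1, 150, "2013-04-02"),
    (2, 1, 100, "2013-04-02"),
    (3, 1, 500, "2013-04-04"),
    (2, 2, 400, "2013-04-04"),
    (1, 1, 300, "2013-04-08"),
    (3, 1, 200, "2013-04-08"),
]


def weekly_ranks(rows):
    """Return [(userid, serverid, total_time, yearweek, rank)] ordered by week, server, user."""
    # Equivalent of total_spent_time: sum per (week, server, user).
    totals = defaultdict(int)
    for userid, serverid, seconds, day in rows:
        iso_year, iso_week, _weekday = date.fromisoformat(day).isocalendar()
        totals[(f"{iso_year}-W{iso_week:02d}", serverid, userid)] += seconds

    # Equivalent of the self join: count higher totals on the same server and week.
    result = []
    for (week, serverid, userid), total in sorted(totals.items()):
        higher = sum(
            1
            for (other_week, other_server, _user), other_total in totals.items()
            if other_week == week and other_server == serverid and other_total > total
        )
        result.append((userid, serverid, total, week, higher + 1))
    return result


for row in weekly_ranks(ROWS):
    print(row)
```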
There are lots of ways you could get the yearweek column; I've used a quick, lazy solution to that for clarity (because I doubt you're struggling with that). But here's how you can get the rank.
Use a self join to pair each row with the rows that have a higher time value than the current row, then count the rows with the higher value.
This is much easier in MSSQL, which is where I live 99% of the time, and where you can just use the RANK() function. I hadn't realised until today there wasn't an equivalent in MySQL. Fun to work out how to get the same result without MS's helping hand.
Prep stuff for context:
CREATE TABLE spenttime (userid int, serverid int, [time] int, [day] DATETIME)
CREATE TABLE weeklookup (weekname VARCHAR(10), weekstart DATETIME, weekend DATETIME)
INSERT INTO spenttime (userid, serverid, [time], [day]) VALUES
(1, 1, 200, '2013-apr-01'),
(1, 1, 150, '2013-apr-02'),
(2, 1, 100, '2013-apr-02'),
(3, 1, 500, '2013-apr-04'),
(2, 2, 400, '2013-apr-04'),
(1, 1, 300, '2013-apr-08'),
(3, 1, 200, '2013-apr-08');
INSERT INTO weeklookup(weekname, weekstart, weekend) VALUES
('2013-w14', '01/apr/2013', '08/apr/2013'),
('2013-w15', '08/apr/2013', '15/apr/2013')
GO
CREATE VIEW weekgroup AS
SELECT a.userid ,
a.serverid ,
a.[time] ,
w1.weekname
FROM spenttime a
INNER JOIN weeklookup w1 ON [day] >= w1.weekstart
AND [day] < w1.weekend
GO
Select statement for the view:
SELECT wv1.userid ,
wv1.serverid ,
wv1.[time] ,
wv1.weekname AS yearweek ,
COUNT(wv2.[time]) + 1 AS rank
FROM weekgroup wv1
LEFT JOIN weekgroup wv2 ON wv1.[time] < wv2.[time]
AND wv1.weekname = wv2.weekname
AND wv1.serverid = wv2.serverid
GROUP BY wv1.userid ,
wv1.serverid ,
wv1.[time] ,
wv1.weekname
ORDER BY wv1.weekname ,
wv1.[time] DESC
If you want to store the rank, you would use an insert trigger. The insert trigger would calculate the rank, as something like:
select count(*)
from spenttime_week w
where w.yearweek = new.yearweek and w.serverid = new.serverid and w.time >= new.time
However, I would not recommend this, because you then have to create an update trigger as well, and modify rank values that are already inserted.
Instead, access the table using SQL like:
select w.*,
(select count(*) from spenttime_week w2 where w2.yearweek = w.yearweek and w2.serverid = w.serverid and w2.time >= w.time
) as rank
from spenttime_week w
This SQL may vary, depending on how you want to handle ties in the data. For performance reasons, you should have an index on at least yearweek, and probably on yearweek, time.