Thanks in advance for any assistance. I am new to SQL and have looked at several related threads on this site, and numerous other sites on Google, but have not been able to figure out what I am doing wrong. I have looked at sub-selects and various JOIN options, and keep bumping into the wrong solution/result.
I have two tables that I am trying to do a query on.
Table:Doctors
idDoctors
PracticeID
FirstName
LastName
Table: Vendor Sales
Id
ProductSales
SalesCommission
DoctorFirstName
DoctorLastName
Here is the Query I am struggling with:
SELECT t1.PracticeID
, SUM( t2.ProductSales ) AS Total_Sales
, COUNT( t1.LastName ) AS Doctor_Count
, COUNT( t1.LastName ) *150 AS Dues
, SUM( t2.ProductSales * t2.SalesCommission ) AS Credit
FROM Doctors AS t1
JOIN VendorSales AS t2 ON t1.Lastname = t2.DoctorLastName
GROUP BY t1.PracticeID
LIMIT 0 , 30
The objective of the Query is to calculate net dues owed by a Practice. I am not yet attempting to calculate the net amount, just trying to get the initial calculations correct.
Result (limited to one result for this example)
PracticeID Total_Sales Doctor_Count Dues Credit
Practice A 16583.04 4 600 304.07360
This is what the result should be:
PracticeID Total_Sales Doctor_Count Dues Credit
Practice A 16583.04 3 450 304.07360
Total Sales sums the individual sales transactions (in this case 4 sales entries totaling 16583.04). Each of the 4 sales has an associated commission rate. The Credit amount is the total (sum) of the commission.
The sales and credit numbers are accurate. But the Doctor count should be 3 (the number of Doctors in the practice), and Dues should be $450 (150 x 3). As you can see, it is multiplying by 4 instead of 3.
What do I need to change in the query to get the proper calculations (Doctors and dues multiplied by 3 instead of 4)? Or should I be doing this differently? Thanks again.
There are various odd things about your schema, and you have not provided the sample data to justify your asserted values.
The first oddity is that you have both first and last name for the doctor in the Doctors table, and in the Vendor Sales table - yet you join only on the last name. Next, you have an ID column, it seems, in the Doctors table, yet you do not use that in the Vendor Sales table for the joining column.
It is not clear whether there is one entry in the Vendor Sales table per doctor, or whether there can be several. Given the counting issues you describe, we must assume there can be several entries per doctor in the Vendor Sales table. It also isn't clear where the vendor is identified, but we have to assume that is not germane to the problem.
So, one set of data you need is the number of doctors per practice, and the dues (which is 150 currency units per doctor). Let's deal with that first:
SELECT PracticeID, COUNT(*) AS NumDoctors, COUNT(*) * 150 AS Dues
FROM Doctors
GROUP BY PracticeID
Then we need the total sales per practice, and the credit too:
SELECT t1.PracticeID, SUM(t2.ProductSales) AS Total_Sales,
SUM(t2.ProductSales * t2.SalesCommission) AS Credit
FROM Doctors AS t1
JOIN VendorSales AS t2 ON t1.Lastname = t2.DoctorLastName
GROUP BY t1.PracticeID
These two partial answers need to be joined on the PracticeID to produce your final result:
SELECT r1.PracticeID, r1.NumDoctors, r1.Dues,
r2.Total_Sales, r2.Credit
FROM (SELECT PracticeID, COUNT(*) AS NumDoctors, COUNT(*) * 150 AS Dues
FROM Doctors
GROUP BY PracticeID) AS r1
JOIN (SELECT t1.PracticeID, SUM(t2.ProductSales) AS Total_Sales,
SUM(t2.ProductSales * t2.SalesCommission) AS Credit
FROM Doctors AS t1
JOIN VendorSales AS t2 ON t1.Lastname = t2.DoctorLastName
GROUP BY t1.PracticeID) AS r2
ON r1.PracticeID = r2.PracticeID;
That should get you the result you seek, I believe. But it is untested SQL - not least because you didn't give us appropriate sample data to work with.
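Since no sample data was posted, here is a minimal sketch that checks the idea on made-up rows. It runs the SQL through Python's built-in sqlite3 module rather than MySQL, and the three doctors and four sales are hypothetical; only the table and column names come from the question. The single-join query counts one row per sale, while the two pre-aggregated subqueries count one row per doctor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Doctors (idDoctors INTEGER PRIMARY KEY, PracticeID TEXT,
                      FirstName TEXT, LastName TEXT);
CREATE TABLE VendorSales (Id INTEGER PRIMARY KEY, ProductSales REAL,
                          SalesCommission REAL,
                          DoctorFirstName TEXT, DoctorLastName TEXT);
-- Hypothetical data: 3 doctors in one practice, 4 sales (Adams has two).
INSERT INTO Doctors VALUES
  (1, 'Practice A', 'Ann', 'Adams'),
  (2, 'Practice A', 'Bob', 'Baker'),
  (3, 'Practice A', 'Cy',  'Clark');
INSERT INTO VendorSales VALUES
  (1, 100.0, 0.02, 'Ann', 'Adams'),
  (2, 200.0, 0.02, 'Ann', 'Adams'),
  (3, 300.0, 0.02, 'Bob', 'Baker'),
  (4, 400.0, 0.02, 'Cy',  'Clark');
""")

# Original shape: the join yields one row per sale, so COUNT sees 4 rows.
naive = conn.execute("""
SELECT t1.PracticeID, COUNT(t1.LastName) AS Doctor_Count
FROM Doctors AS t1
JOIN VendorSales AS t2 ON t1.LastName = t2.DoctorLastName
GROUP BY t1.PracticeID
""").fetchone()

# Pre-aggregated shape: doctors are counted before sales are joined in.
fixed = conn.execute("""
SELECT r1.PracticeID, r1.NumDoctors, r1.Dues, r2.Total_Sales
FROM (SELECT PracticeID, COUNT(*) AS NumDoctors, COUNT(*) * 150 AS Dues
      FROM Doctors
      GROUP BY PracticeID) AS r1
JOIN (SELECT t1.PracticeID, SUM(t2.ProductSales) AS Total_Sales
      FROM Doctors AS t1
      JOIN VendorSales AS t2 ON t1.LastName = t2.DoctorLastName
      GROUP BY t1.PracticeID) AS r2
  ON r1.PracticeID = r2.PracticeID
""").fetchone()

print(naive)  # doctor count inflated to the number of sales
print(fixed)  # doctor count and dues based on the Doctors table only
```

Note that joining on the last name alone would still double-count sales if two doctors shared a surname; joining on the doctor ID would avoid that.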
I'm trying to implement a query which gives me the profit of the most profitable room in each hotel (25 hotels).
Below is my query:
SELECT hotels.hotel_id,rooms.room_id,hotel_name,room_number,sum(rooms.room_price) AS profit,COUNT(rooms.room_id) AS count
FROM hotels,rooms,bookings
WHERE hotels.hotel_id=rooms.hotel_id
AND rooms.room_id=bookings.room_id
GROUP BY rooms.room_id
This is the closest outcome I have got so far (ignore the language of the hotel names).
[Screenshots omitted: query result, hotels table, rooms table, bookings table (two parts)]
hotel_id 1 has 5 rooms, the room_number 300 made the most profit. I want to show the most profit only of each hotel. I don't need the other rooms that made less profit.
Update:
So I solved a similar query where I wanted the 2 rooms that made the most profit. But I just can't think of any function that gives me only the best profit for each hotel. A little hint or help would be appreciated.
Try this query:
SELECT * FROM
(SELECT hotels.hotel_id,rooms.room_id,hotel_name,room_number,SUM(rooms.room_price) AS profit,COUNT(rooms.room_id) AS COUNT
FROM hotels,rooms,bookings
WHERE hotels.hotel_id=rooms.hotel_id
AND rooms.room_id=bookings.room_id
GROUP BY rooms.room_id) a GROUP BY hotel_id;
Edit:
This might do it:
SELECT hotel_id,room_id,room_number,MAX(a.tc) AS "Count",MAX(tp) AS "MostProfit" FROM
(SELECT hotel_id,rooms.room_id,room_number,COUNT(rooms.room_id) AS "tc",SUM(room_price) AS "tp" FROM rooms JOIN bookings
ON rooms.room_id=bookings.room_id
GROUP BY rooms.room_id) a GROUP BY hotel_id
Please try below once:
SELECT RO_BOOK.HOTEL_ID,
RO_BOOK.ROOM_ID,
RO_BOOK.ROOM_NUMBER,
RO_BOOK.TOTAL_BOOKINGS,
MAX(RO_BOOK.TOTAL_EARNINGS) PROFITS
FROM(
SELECT ROOMS.HOTEL_ID,
ROOMS.ROOM_ID,
ROOMS.ROOM_NUMBER,
COUNT(ROOMS.ROOM_ID) TOTAL_BOOKINGS,
SUM(ROOMS.ROOM_PRICE) TOTAL_EARNINGS
FROM
ROOMS, BOOKINGS
WHERE
BOOKINGS.ROOM_ID = ROOMS.ROOM_ID
GROUP BY ROOMS.ROOM_ID) RO_BOOK
GROUP BY RO_BOOK.HOTEL_ID ;
It is similar to @tcadidot0's code, but there the column MAX(a.tc) AS "Count" returns the maximum count irrespective of ROOM_ID.
For example:
Suppose hotel 1 has 2 rooms, say R100 and R200. The cost of R100 is 1000 and the cost of R200 is 100.
R100 was booked 1 time, and R200 was booked 3 times.
With MAX(a.tc), the query would return the count of one room next to the profit of another:
HOTEL 1, R100, COUNT 3, PROFIT 1000.
Please correct me if I got the question wrong.
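One way to avoid mixing values from different rooms is to compute the profit per room first and then join that back to the per-hotel maximum. The sketch below is an alternative technique, not what either answer above does. It runs in Python's built-in sqlite3 module, uses a common table expression (which needs MySQL 8.0 or later), and the sample rows are hypothetical; hotel 1 follows the R100/R200 example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rooms (room_id INTEGER PRIMARY KEY, hotel_id INTEGER,
                    room_number INTEGER, room_price INTEGER);
CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, room_id INTEGER);
-- Hypothetical data. Hotel 1: room 100 (1000, booked once),
-- room 200 (100, booked three times). Hotel 2: two cheaper rooms.
INSERT INTO rooms VALUES
  (1, 1, 100, 1000), (2, 1, 200, 100),
  (3, 2, 10, 50),    (4, 2, 11, 80);
INSERT INTO bookings VALUES
  (1, 1), (2, 2), (3, 2), (4, 2), (5, 3), (6, 3), (7, 4);
""")

best_rooms = conn.execute("""
WITH room_profit AS (
    -- Layer 1: bookings and profit per room.
    SELECT r.hotel_id, r.room_id, r.room_number,
           COUNT(*) AS bookings, SUM(r.room_price) AS profit
    FROM rooms AS r
    JOIN bookings AS b ON b.room_id = r.room_id
    GROUP BY r.hotel_id, r.room_id, r.room_number
)
-- Layer 2: keep only the rows whose profit equals the hotel's maximum.
SELECT rp.hotel_id, rp.room_id, rp.room_number, rp.bookings, rp.profit
FROM room_profit AS rp
JOIN (SELECT hotel_id, MAX(profit) AS max_profit
      FROM room_profit
      GROUP BY hotel_id) AS m
  ON m.hotel_id = rp.hotel_id AND m.max_profit = rp.profit
ORDER BY rp.hotel_id
""").fetchall()

for row in best_rooms:
    print(row)
```

If two rooms in the same hotel tie for the highest profit, this returns both of them.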
I have these two tables
The first one is expenses table and the second one is expensename
Exp_Type in the first table refers to the id of the expense name in the second table; for example, 2 is Food.
I am trying to group expenses by expense type and get the data between certain dates.
This is what I have tried, but it won't work.
select
(select
(select name from EXPENSENAME where id=EXP_TYPE)as ExpenseType,
sum(PRICE) as cost
from EXPENSES WHERE USERID=1 GROUP BY EXPENSES.EXP_TYPE),
[date]
from EXPENSES where [date] BETWEEN '10-09-2015' and '10-18-2015 23:59:59'
And
select
(select name from EXPENSENAME where id=EXP_TYPE)as ExpenseType,
sum(PRICE) as cost,
date
from EXPENSES WHERE USERID=1 and DATE BETWEEN '01/10/2015' and '29/10/2015' GROUP BY EXPENSES.EXP_TYPE
Without the date I get a result from this query, but I need the same data between certain dates. Please help.
select
(select name from EXPENSENAME where id=EXP_TYPE)as ExpenseType,
sum(PRICE) as cost
from EXPENSES WHERE USERID=1 GROUP BY EXPENSES.EXP_TYPE
You want to join the tables together:
SELECT en.name as ExpenseType, SUM(e.price) as cost
FROM expenses e
JOIN expensename en ON en.id = e.exp_type
WHERE e.date BETWEEN '10-09-2015' and '10-18-2015'
GROUP BY en.name
This should give you the cost per expense name.
The query you have now is terrible, and this is why:
SELECT (SELECT ... FROM ... WHERE ... ) as ...
This creates a correlated subquery, which executes once for every row of the parent select. If you have a table with 4 rows in it, (SELECT ... FROM ... WHERE ... ) will execute 4 times, scanning 16 rows in total (assuming it reads from the same table). In general that is a really bad way to get data. If you have a million rows, do the math: it's a bad idea.
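To check the join-and-group approach on a few rows, here is a small sketch using Python's built-in sqlite3 module instead of SQL Server. Everything in the sample data is made up, and it assumes dates are stored as ISO 'YYYY-MM-DD HH:MM:SS' text. It also keeps the USERID filter from the question and uses a half-open range (from the start date up to, but not including, the day after the end date) so that times late on the last day are still included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expensename (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE expenses (id INTEGER PRIMARY KEY, userid INTEGER,
                       exp_type INTEGER, price REAL, [date] TEXT);
-- Hypothetical rows; dates are ISO-formatted text (an assumption).
INSERT INTO expensename VALUES (1, 'Rent'), (2, 'Food');
INSERT INTO expenses VALUES
  (1, 1, 2,  10.0, '2015-10-09 08:00:00'),
  (2, 1, 2,   5.5, '2015-10-18 23:30:00'),
  (3, 1, 1, 500.0, '2015-10-10 00:00:00'),
  (4, 1, 2,   7.0, '2015-10-19 00:00:00'),  -- just outside the range
  (5, 2, 2,  99.0, '2015-10-12 12:00:00');  -- a different user
""")

totals = conn.execute("""
SELECT en.name AS ExpenseType, SUM(e.price) AS cost
FROM expenses AS e
JOIN expensename AS en ON en.id = e.exp_type
WHERE e.userid = ?
  AND e.[date] >= ?   -- first day, inclusive
  AND e.[date] <  ?   -- day after the last day, exclusive
GROUP BY en.name
ORDER BY en.name
""", (1, "2015-10-09", "2015-10-19")).fetchall()

for row in totals:
    print(row)
```

Passing the dates as parameters also sidesteps the ambiguity between '10-09-2015' and '09/10/2015' style literals.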
I have a query regarding a query in MySQL.
I have 2 tables one containing SalesRep details like name, email, etc. I have another table with the sales data which has reportDate, customers served and link to the salesrep via a foreign key. One thing to note is that the reportDate is always a friday.
So the requirement is this: I need to find sales data for a 13 week period for a given list of sales reps - with 0 as customers served if on a particular friday there is no data. The query result is consumed by a Java application which relies on the 13 rows of data per sales rep.
I have created a table with all the Friday dates populated and wrote a outer join like below:
select * from (
select name, customersServed, reportDate
from Sales_Data salesData
join `SALES_REPRESENTATIVE` salesRep on salesRep.`employeeId` = salesData.`employeeId`
where employeeId = 1
) as result
right outer join fridays on fridays.datefield = reportDate
where fridays.datefield between '2014-10-01' and '2014-12-31'
order by datefield
Now my doubts:
Is there any way where i can get the name to be populated for all 13 rows in the above query?
If there are 2 sales reps, I'd like to use a IN clause and expect 26 rows in total - 13 rows per sales person (even if there is no record for that person, I'd still like to see 13 rows of nulls), and 39 for 3 sales reps
Can these be done in MySql and if so, can anyone point me in the right direction?
First build the full set of rows you need (every sales rep paired with every Friday, without customersServed), then outer join the sales data onto it to pick up customersServed.
Something like this:
select records.name, records.datefield, IFNULL(salesData.customersServed,0)
from (
select employeeId, name, datefield
from `SALES_REPRESENTATIVE`, fridays
where fridays.datefield between '2014-10-01' and '2014-12-31'
and employeeId in (...)
) as records
left outer join `Sales_Data` salesData on (salesData.employeeId = records.employeeId and salesData.reportDate = records.datefield)
order by records.name, records.datefield
You'll have to nest two levels deep. In your nested query, change to an outer join for salesrep so that you have at least 1 record for each rep. Then join with fridays without any condition so that you have at least 13 records for each rep. Then do a final right outer join with the condition (fridays.datefield = innerfriday.datefield and (reportDate is null or reportDate=innerfriday.datefield)).
This is very inefficient; unless the data is very small, try to do it in code instead.
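Here is a rough, runnable sketch of the "build every rep/Friday pair first, then left join the sales data" idea. It uses Python's built-in sqlite3 module rather than MySQL, only three Fridays instead of thirteen, and invented reps and figures; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_representative (employeeId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fridays (datefield TEXT PRIMARY KEY);
CREATE TABLE sales_data (employeeId INTEGER, reportDate TEXT,
                         customersServed INTEGER);
-- Hypothetical data: Bob has no sales rows at all.
INSERT INTO sales_representative VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO fridays VALUES ('2014-10-03'), ('2014-10-10'), ('2014-10-17');
INSERT INTO sales_data VALUES (1, '2014-10-03', 5), (1, '2014-10-17', 7);
""")

rows = conn.execute("""
SELECT r.name, f.datefield,
       COALESCE(sd.customersServed, 0) AS customersServed
FROM sales_representative AS r
CROSS JOIN fridays AS f               -- every rep paired with every Friday
LEFT JOIN sales_data AS sd            -- attach sales where they exist
       ON sd.employeeId = r.employeeId
      AND sd.reportDate = f.datefield
WHERE r.employeeId IN (?, ?)
  AND f.datefield BETWEEN ? AND ?
ORDER BY r.name, f.datefield
""", (1, 2, "2014-10-01", "2014-12-31")).fetchall()

for row in rows:
    print(row)
```

Each rep gets one row per Friday in the range, with the name filled in on every row and 0 where there is no sales record.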
I have two tables: sales, actions
Sales table:
id, datetime, status
--------------------
Actions table:
id, datetime, sales_id, action
------------------------------
There's a many-to-one relationship between the actions and sales tables. For each sales record, there could be numerous actions. I am trying to determine, by each hour of the day, what the average time difference is between when sales records are first created and when the first action record associated with its respective sales record was created.
In other words, how fast (in hours) are sales agents responding to leads, based on what hour of the day the lead came in.
Here's what I tried:
SELECT
FROM_UNIXTIME(sales.datetime, '%H') as Hour,
count(actions.id) AS actions,
(MIN(actions.datetime) - sales.datetime) / 3600 as Lag
FROM
actions
INNER JOIN sales ON actions.sales_id = sales.id
group by Hour
I get what looks like reasonable hours numbers for 'Lag', but I am not convinced they're accurate:
Hour Actions Lag
00 66 11.0442
01 30 11.2758
02 50 8.2900
03 25 5.7492
.
.
.
23 77 34.4744
My question is, is this the correct way to get the values for the first action that was recorded for a given sales record? :
(MIN(actions.createDate) - sales.createDate) / 3600 as Lag
It should be:
MIN(actions.datetime - sales.datetime) / 3600 AS Lag
Your way takes the first action from any sale within the hour and subtracts each sale's timestamp from that action's timestamp. You need to do the subtraction only between actions and sales that are joined by the ID.
This query has two layers, and it's helpful to crawl through them both.
The lowest layer should compute the lag time from sales.datetime to the earliest action.datetime for each row of sales. That will probably use a MIN() function.
The next layer will compute the statistics for those lag times, worked out in the lowest layer, by hour of the day. That will use an AVG() function.
Here's the lowest layer:
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
FROM sales AS s
JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
The second part of that ON clause makes sure that you only consider actions taken after the sales order was entered. It may be unnecessary, but I thought I'd throw it in.
Here's the second layer.
SELECT HOUR(datetime) AS hour_Sale_entered,
COUNT(*) AS number_in_that_hour,
AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
FROM sales AS s
JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
See how there are two nested aggregations (GROUP BY) operations? The inner one identifies the first action, and the second one does the hourly averaging.
One more tidbit. If you want to include sales items that have not yet been acted on, you can do this:
SELECT HOUR(datetime) AS hour_Sale_entered,
COUNT(*) AS number_in_that_hour,
SUM(no_action) AS not_acted_upon_yet,
AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds,
IF(COUNT(a.id) = 0, 1, 0) AS no_action
FROM sales AS s
LEFT JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
The average of lag_seconds will still be correct, because the sales rows with no action rows will have NULL values for that, and AVG() ignores nulls.
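The two-layer shape can be tried out on a handful of rows. This sketch uses Python's built-in sqlite3 module, so the date functions differ from MySQL: it assumes the datetime columns hold Unix epoch seconds (the question's FROM_UNIXTIME call suggests that) and extracts the hour with SQLite's strftime. The sample sales and actions are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER PRIMARY KEY, datetime INTEGER, status TEXT);
CREATE TABLE actions (id INTEGER PRIMARY KEY, datetime INTEGER,
                      sales_id INTEGER, action TEXT);
-- Hypothetical data, timestamps in epoch seconds (UTC).
INSERT INTO sales VALUES
  (1,    0, 'open'),   -- 00:00, first action after 3600 s
  (2, 1800, 'open'),   -- 00:30, first action after 7200 s
  (3, 3600, 'open'),   -- 01:00, first action after 1800 s
  (4, 7200, 'open');   -- 02:00, no action yet
INSERT INTO actions VALUES
  (1, 3600, 1, 'call'), (2, 7200, 1, 'email'),
  (3, 9000, 2, 'call'),
  (4, 5400, 3, 'call');
""")

hourly = conn.execute("""
SELECT hour_entered,
       COUNT(*)                   AS sales_in_hour,
       SUM(no_action)             AS not_acted_upon_yet,
       AVG(lag_seconds) / 3600.0  AS lag_hours   -- AVG skips NULL lags
FROM (
    -- Inner layer: one row per sale with the lag to its first action.
    SELECT s.id,
           strftime('%H', s.datetime, 'unixepoch') AS hour_entered,
           MIN(a.datetime) - s.datetime            AS lag_seconds,
           CASE WHEN COUNT(a.id) = 0 THEN 1 ELSE 0 END AS no_action
    FROM sales AS s
    LEFT JOIN actions AS a
           ON a.sales_id = s.id AND a.datetime >= s.datetime
    GROUP BY s.id, s.datetime
) AS d
GROUP BY hour_entered
ORDER BY hour_entered
""").fetchall()

for row in hourly:
    print(row)
```

Sale 1 has two actions, but the inner GROUP BY reduces it to a single row, so it contributes one lag value to the hourly average.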
I am trying to count sales made by a list of sales agents. The count is made every few minutes and updates a screen showing a 'sales leader board', which is refreshed using an Ajax call in the background.
I have one table which is created and populated every night containing the agent_id and the total sales for the week and month. I create a second, temporary table, on the fly which counts the sales for the day.
I need to combine the two tables to create a current list of sales for all agents in agent_count.
Table agent_count;
agent_id (varchar),
team_id (varchar),
name (varchar),
day(int),
week(int),
month(int)
Table sales;
agent_id (varchar),
day(int)
I can't figure out how to combine these tables. I think I need to use a join as all agents must be returned - even if they don't appear in the agent_count table.
First I make a simple call to get the week and month totals for all agents
SELECT agent_id, team_id, name, week, month FROM agent_count;
Then I create a temporary table of today's sales, and then I count the sales for each agent for the day:
CREATE TEMPORARY TABLE temp_todays_sales
SELECT s.id, s.agent_id
FROM sales s
WHERE DATEDIFF(s.uploaded, NOW()) = 0
AND s.valid = 1;
SELECT tts.agent_id, COUNT(tts.id) as today
FROM temp_todays_sales tts
GROUP BY tts.agent_id;
What is the best/easiest way to combine these to end up with a result set such as
agent_id, team_id, name, day, week, month
where week and month also include the daily totals
thanks for any help!
Christy
SELECT s.agent_id, ac.team_id, ac.name,
s.`day` + COALESCE(ac.`day`, 0) AS `day`,
s.`day` + COALESCE(ac.`week`, 0) AS `week`,
s.`day` + COALESCE(ac.`month`, 0) AS `month`
FROM sales s
LEFT JOIN
agent_count ac
ON ac.agent_id = s.agent_id
team_id and name will be NULL if there is no record in agent_count for an agent.
If the agents can be missing from both tables, you normally would need to make a FULL JOIN but since MySQL does not support the latter you may use its poor man's substitution:
SELECT agent_id, MAX(team_id), MAX(name),
SUM(day), SUM(week), SUM(month)
FROM (
SELECT agent_id, NULL AS team_id, NULL AS name, day, day AS week, day AS month
FROM sales
UNION ALL
SELECT *
FROM agent_count
) q
GROUP BY
agent_id
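The UNION ALL plus GROUP BY substitute for a full join can be checked on a few rows. The sketch below runs it in Python's built-in sqlite3 module with hypothetical agents: one present in both tables, one only in agent_count, and one only in sales. It lists the agent_count columns explicitly instead of using SELECT *, so the two halves of the union are guaranteed to line up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agent_count (agent_id TEXT, team_id TEXT, name TEXT,
                          day INTEGER, week INTEGER, month INTEGER);
CREATE TABLE sales (agent_id TEXT, day INTEGER);
-- Hypothetical data: a1 is in both tables, a2 only in agent_count,
-- a3 only in today's sales.
INSERT INTO agent_count VALUES
  ('a1', 't1', 'Ann', 0, 5, 20),
  ('a2', 't1', 'Bob', 0, 3, 9);
INSERT INTO sales VALUES ('a1', 2), ('a3', 4);
""")

board = conn.execute("""
SELECT agent_id, MAX(team_id) AS team_id, MAX(name) AS name,
       SUM(day) AS day, SUM(week) AS week, SUM(month) AS month
FROM (
    -- Today's sales count toward day, week and month alike.
    SELECT agent_id, NULL AS team_id, NULL AS name,
           day, day AS week, day AS month
    FROM sales
    UNION ALL
    SELECT agent_id, team_id, name, day, week, month
    FROM agent_count
) AS q
GROUP BY agent_id
ORDER BY agent_id
""").fetchall()

for row in board:
    print(row)
```

MAX ignores the NULL placeholders, so team_id and name come from agent_count whenever the agent exists there and stay NULL otherwise.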