Retrieve a data from a partition in SQL Servertable - mysql

I'm new to SQL Server partitioning. I have tried a some of the tutorials and finally I got this based on that created a partition but the problem is that I can't retrieve the data's based on partition like in MYSQL query
SELECT * FROM TABLE_NAME PARTITION(partitionId)
and also tried this query in SQL Server
SELECT $PARTITION.partition_function (partition_function_id)
EDIT:
The picture shows the data in table I have created partition based on Date, I like to retrieve data based on partition name, which is if partition name is 20170203 I am excepting results on that date.

select * from TABLE_NAME
where $partition.fpartfunction(date) = 1
This returns all the data in partition 1.
If you want to retrieve it from the date you can do as below,
CREATE PARTITION FUNCTION PF_Date(date) AS RANGE LEFT FOR VALUES('2017-01-01','2018-01-01');
--first partition
SELECT *
FROM schema.table_name
WHERE key_column <= '2017-01-01';
--second partition
SELECT *
FROM schema.table_name
WHERE key_column > '2017-01-01' AND key_column <= '2018-01-01';
--last partition
SELECT *
FROM schema.table_name
WHERE key_column > '2018-01-01';

I hope this may work for your need:
CREATE PROCEDURE USP_GetRowsWithinPartition (#InputDate DATE)
AS
BEGIN
DECLARE #TV TABLE
(StartDate DATE,
EndDate DATE
)
INSERT INTO #TV (StartDate, EndDate)
SELECT TOP 1
CAST((CASE
WHEN (DATEPART(MONTH, #InputDate)) = 7 AND (DATEPART(DAY, #InputDate)) > 2
THEN DATEADD(YEAR, -1, CONVERT(DATE, prv.Value, 116))
WHEN (DATEPART(MONTH, #InputDate)) > 7 AND (DATEPART(DAY, #InputDate)) >= 1
THEN DATEADD(YEAR, -1, CONVERT(DATE, prv.Value, 116))
ELSE prv.Value
END) AS DATE) AS StartDate,
CAST((CASE
WHEN (DATEPART(MONTH, #InputDate)) = 7 AND (DATEPART(DAY, #InputDate)) > 2
THEN prv.Value
WHEN (DATEPART(MONTH, #InputDate)) > 7 AND (DATEPART(DAY, #InputDate)) >= 1
THEN prv.Value
ELSE DATEADD(YEAR, 1, CONVERT(DATE, prv.Value, 116))
END ) AS DATE) AS EndDate
FROM sys.partition_functions pf
INNER JOIN sys.partition_schemes AS ps ON ps.function_id = pf.function_id
INNER JOIN sys.destination_data_spaces AS dds ON dds.partition_scheme_id = ps.data_space_id
INNER JOIN sys.dm_db_partition_stats pstats ON pstats.partition_number = dds.destination_id
INNER JOIN sys.partitions p ON p.partition_id = pstats.partition_id
INNER JOIN sys.data_spaces AS ds ON dds.data_space_id = ds.data_space_id
INNER JOIN sys.partition_range_values AS prv ON pf.function_id = prv.function_id
WHERE pstats.object_id = OBJECT_ID('YourTable')
GROUP BY prv.boundary_id, prv.value
ORDER BY ABS(DATEDIFF(DAY, #InputDate, Convert(DATE, value, 116)))
SELECT yt.* FROM YourTable yt
INNER JOIN #TV t ON yt.DateColumn>=t.StartDate AND yt.DateColumn < t.EndDate
END
EXEC USP_GetRowsWithinPartition '2015-08-01'

(MySQL)
There is virtually no use for accessing a specific PARTITION. You should let the WHERE clause pick which partition(s) to use.
Furthermore, there are very few uses for partitioning. I recommend that you ignore that feature until you have a non-trivial amount of experience under your belt.
Any, to answer your question... Yes, this should work:
SELECT * FROM TABLE_NAME PARTITION(partitionId);
But that was not available until 5.6.
The partitionid may need to be quoted, especially if it is a number. For further research, please provide SHOW CREATE TABLE.

Related

mysql max and min subquery using date range

I have the following query:
SELECT
(Date + INTERVAL -(WEEKDAY(Date)) DAY) `Date`,
I would like to use a subquery here to get the oldest and newest inventory from the max and min Date:
(select sellable from clabDevelopment.fba_history_daily where Date =
max(Date))
max(Date), min(Date),
ASIN,
ItemSKU,
it.avgInv,
kt.Account, kt.Country, SUM(Sessions) `Sessions`, avg(Session_Pct)`Session_Pct`,
sum(Page_Views)`Page_Views`, avg(Page_Views_Pct)`Page_Views_Pct`, avg(Buy_Box_Pct)`Buy_Box_Pct`,
sum(Units_Ordered)`Units_Ordered`, sum(Units_Ordered_B2B) `Units_Ordered_B2B`,
avg(Unit_Session_Pct)`Unit_Session_Pct`, avg(Unit_Session_Pct_B2B)`Unit_Session_Pct_B2B`,
sum(Ordered_Product_Sales)`Ordered_Product_Sales`, sum(Total_Order_Items) `Total_Order_Items`, sum(Actual_Sales) `Actual_Sales`,
sum(Orders) `Orders`, sum(PPC_Revenue) `PPC_Revenue`, sum(PPC_Orders) `PPC_Orders`,
sum(Revenue)`Revenue`, sum(Sales_Tax_Collected) `Sales_Tax_Collected`, sum(Total_Ad_Spend) `Total_Ad_Spend`, sum(Impressions) `Impressions`,
sum(Profit_after_Fees_before_Costs) `Profit_after_Fees_before_Cost`
FROM clabDevelopment.KPI_kpireport as kt
left outer join
(SELECT Month(Date) as mnth, sku, account, country, avg(sellable)`avgInv` FROM clabDevelopment.`fba_history_daily`
where sellable >= 0
group by Month(Date), sku, account, country) as it
on kt.ItemSKU = it.SKU
and kt.Account = it.account
and kt.Country = it.country
and it.mnth = Month(kt.Date)
WHERE kt.Country = 'USA' or kt.Country = 'CAN'
GROUP BY Account, Country,(Date + INTERVAL -(WEEKDAY(Date)) DAY), ItemSKU
ORDER BY Date desc
The sub-query would be from the same table I am joining on the bottom except I group by month there. So I want to run this subquery and grab the value under sellable for the date of max(Date):
(select sellable from clabDevelopment.`fba_history_daily where Date = max(Date))
When I do it this way I get invalid use of group function.
Without known your schema and the engine/db it is difficult to understand the problem. But, here is a best guess with the following schema:
fba_history_daily
- mnth
- sku
- account
- country
- sellable
- SKU
KPI_kpireport
- Account
- Country
- ItemSKU
- Account
- Date
- Country
- ASIN
The following query would give you what you're looking for. This uses a GROUP_CONCAT in order to build the required results through aggregation. With the nested query join MySQL might be building a temporary table within memory to sort through those records which would not be optimal. You can check this using EXPLAIN and you would see Using temporary in the details.
SELECT
(Date + INTERVAL -(WEEKDAY(Date)) DAY) `Date`,
ASIN,
ItemSKU,
-- MIN
(SUBSTRING_INDEX(GROUP_CONCAT(it.sellable ORDER BY it.Date ASC),',', 1) AS minSellable),
-- MAX
(SUBSTRING_INDEX(GROUP_CONCAT(it.sellable ORDER BY it.Date DESC),',', 1) AS maxSellable),
-- AVG
AVG(it.sellable) avgInv,
kt.Account, kt.Country, SUM(Sessions) `Sessions`, avg(Session_Pct)`Session_Pct`,
sum(Page_Views)`Page_Views`, avg(Page_Views_Pct)`Page_Views_Pct`, avg(Buy_Box_Pct)`Buy_Box_Pct`,
sum(Units_Ordered)`Units_Ordered`, sum(Units_Ordered_B2B) `Units_Ordered_B2B`,
avg(Unit_Session_Pct)`Unit_Session_Pct`, avg(Unit_Session_Pct_B2B)`Unit_Session_Pct_B2B`,
sum(Ordered_Product_Sales)`Ordered_Product_Sales`, sum(Total_Order_Items) `Total_Order_Items`, sum(Actual_Sales) `Actual_Sales`,
sum(Orders) `Orders`, sum(PPC_Revenue) `PPC_Revenue`, sum(PPC_Orders) `PPC_Orders`,
sum(Revenue)`Revenue`, sum(Sales_Tax_Collected) `Sales_Tax_Collected`, sum(Total_Ad_Spend) `Total_Ad_Spend`, sum(Impressions) `Impressions`,
sum(Profit_after_Fees_before_Costs) `Profit_after_Fees_before_Cost`
FROM KPI_kpireport as kt
left outer join fba_history_daily it on
kt.ItemSKU = it.SKU
and kt.Account = it.account
and kt.Country = it.country
and Month(it.Date) = Month(kt.Date)
and it.sellable >= 0
WHERE kt.Country = 'USA' or kt.Country = 'CAN'
GROUP BY Account, Country,(Date + INTERVAL -(WEEKDAY(Date)) DAY), ItemSKU
ORDER BY Date desc

Calculate Date by number of working days from a certain Startdate

I've got two tables, a project table and a calendar table. The first containts a startdate and days required. The calendar table contains the usual date information, like date, dayofweek, and a column is workingday, which shows if the day is a saturday, sunday, or bank holiday (value = 0) or a regular workday (value = 1).
For a certain report I need write a stored procedure that calculates the predicted enddate by adding the number of estimated workddays needed.
Example:
**Projects**
Name Start_Planned Work_days_Required
Project A 02.05.2016 6
Calendar (04.05 is a bank holdiday)
Day Weekday Workingday
01.05.2016 7 0
02.05.2016 1 1
03.05.2016 2 1
04.05.2016 3 0
05.05.2016 4 1
06.05.2016 5 1
07.05.2016 6 0
08.05.2016 7 0
09.05.2016 1 1
10.05.2016 2 1
Let's say, the estimated number of days required is given as 6 (which leads to the predicted enddate of 10.05.2016). Is it possible to join the tables in a way, which allows me to put something like
select date as enddate_predicted
from calendar
join projects
where number_of_days = 6
I would post some more code, but I'm quite stuck on how where to start.
Thanks!
You could get all working days after your first date, then apply ROW_NUMBER() to get the number of days for each date:
SELECT Date, DayNum = ROW_NUMBER() OVER(ORDER BY Date)
FROM Calendar
WHERE IsWorkingDay = 1
AND Date >= #StartPlanned
Then it would just be a case of filtering for the 6th day:
DECLARE #StartPlanned DATE = '20160502',
#Days INT = 6;
SELECT Date
FROM ( SELECT Date, DayNum = ROW_NUMBER() OVER(ORDER BY Date)
FROM Calendar
WHERE WorkingDay = 1
AND Date >= #StartPlanned
) AS c
WHERE c.DayNum = #Days;
It's not part of the question, but for future proofing this is easier to acheive in SQL Server 2012+ with OFFSET/FETCH
DECLARE #StartPlanned DATE = '20160502',
#Days INT = 6;
SELECT Date
FROM dbo.Calendar
WHERE Date >= #StartPlanned
AND WorkingDay = 1
ORDER BY Date
OFFSET (#Days - 1) ROWS FETCH NEXT 1 ROWS ONLY
ADDENDUM
I missed the part earlier about having another table, and the comment about putting it into a cursor has prompted me to amend my answer. I would add a new column to your calendar table called WorkingDayRank:
ALTER TABLE dbo.Calendar ADD WorkingDayRank INT NULL;
GO
UPDATE c
SET WorkingDayRank = wdr
FROM ( SELECT Date, wdr = ROW_NUMBER() OVER(ORDER BY Date)
FROM dbo.Calendar
WHERE WorkingDay = 1
) AS c;
This can be done on the fly, but you will get better performance with it stored as a value, then your query becomes:
SELECT p.Name,
p.Start_Planned,
p.Work_days_Required,
EndDate = c2.Date
FROM Projects AS P
INNER JOIN dbo.Calendar AS c1
ON c1.Date = p.Start_Planned
INNER JOIN dbo.Calendar AS c2
ON c2.WorkingDayRank = c1.WorkingDayRank + p.Work_days_Required - 1;
This simply gets the working day rank of your start date, and finds the number of days ahead specified by the project by joining on WorkingDayRank (-1 because you want the end date inclusive of the range)
This will fail, if you ever plan to start your project on a non working day though, so a more robust solution might be:
SELECT p.Name,
p.Start_Planned,
p.Work_days_Required,
EndDate = c2.Date
FROM Projects AS P
CROSS APPLY
( SELECT TOP 1 c1.Date, c1.WorkingDayRank
FROM dbo.Calendar AS c1
WHERE c1.Date >= p.Start_Planned
AND c1.WorkingDay = 1
ORDER BY c1.Date
) AS c1
INNER JOIN dbo.Calendar AS c2
ON c2.WorkingDayRank = c1.WorkingDayRank + p.Work_days_Required - 1;
This uses CROSS APPLY to get the next working day on or after your project start date, then applies the same join as before.
This query returns a table with a predicted enddate for each project
select name,min(day) as predicted_enddate from (
select c.day,p.name from dbo.Calendar c
join dbo.Calendar c2 on c.day>=c2.day
join dbo.Projects p on p.start_planned<=c.day and p.start_planned<=c2.day
group by c.day,p.work_days_required,p.name
having sum(c2.workingday)=p.work_days_required
) a
group by name
--This gives me info about all projects
select p.projectname,p.Start_Planned ,c.date,
from calendar c
join
projects o
on c.date=dateadd(days,p.Work_days_Required,p.Start_Planned)
and c.isworkingday=1
now you can use CTE like below or wrap this in a procedure
;with cte
as
(
Select
p.projectnam
p.Start_Planned ,
c.date,datediff(days,p.Start_Planned,c.date) as nooffdays
from calendar c
join
projects o
on c.date=dateadd(days,p.Work_days_Required,p.Start_Planned)
and c.isworkingday=1
)
select * from cte where nooffdays=6
use below logic
CREATE TABLE #proj(Name varchar(50),Start_Planned date,
Work_days_Required int)
insert into #proj
values('Project A','02.05.2016',6)
CReATE TABLE #Calendar(Day date,Weekday int,Workingday bit)
insert into #Calendar
values('01.05.2016',7,0),
('02.05.2016',1,1),
('03.05.2016',2,1),
('04.05.2016',3,0),
('05.05.2016',4,1),
('06.05.2016',5,1),
('07.05.2016',6,0),
('08.05.2016',7,0),
('09.05.2016',1,1),
('10.05.2016',2,1)
DECLARE #req_day int = 3
DECLARE #date date = '02.05.2016'
--SELECT #req_day = Work_days_Required FROM #proj where Start_Planned = #date
select *,row_number() over(order by [day] desc) as cnt
from #Calendar
where Workingday = 1
and [Day] > #date
SELECT *
FROM
(
select *,row_number() over(order by [day] desc) as cnt
from #Calendar
where Workingday = 1
and [Day] > #date
)a
where cnt = #req_day

Get last data for contracts

I want to select last information about client's balance from MySQL's database. I wrote next script:
SELECT *
FROM
(SELECT
contract_balance.cid,
/*contract_balance.yy,
contract_balance.mm,*/
contract_balance.expenses,
contract_balance.revenues,
contract_balance.expenses + contract_balance.revenues AS total,
(CAST(CAST(CONCAT(contract_balance.yy,'-',contract_balance.mm,'-01')AS CHAR) AS DATE)) AS dt
FROM contract_balance
/*WHERE
CAST(CAST(CONCAT(contract_balance.yy,'-',contract_balance.mm,'-01')AS CHAR) AS DATE) < '2013-11-01'
LIMIT 100*/
) AS tmp
WHERE tmp.dt = (
SELECT MAX(b.dt)
FROM tmp AS b
WHERE tmp.cid = b.cid
)
But server return:
Table 'clientsdatabase.tmp' doesn't exist
How to change this code for get required data?
Try this one in your subquery you are trying to get the MAX of (CAST(CAST(CONCAT(contract_balance.yy,'-',contract_balance.mm,'-01')AS CHAR) AS DATE)) AS dt but in subquery your aliased table tmp doesn't exist so the simplest way you can do is to calculate the MAX of dt and use GROUP BY contract_balance.cid contractor id ,i guess it will fullfill your needs
SELECT
contract_balance.cid,
contract_balance.expenses,
contract_balance.revenues,
contract_balance.expenses + contract_balance.revenues AS total,
MAX((CAST(CAST(CONCAT(contract_balance.yy,'-',contract_balance.mm,'-01')AS CHAR) AS DATE))) AS dt
FROM contract_balance
GROUP BY contract_balance.cid
Try this:
SELECT *
FROM (SELECT cb.cid, cb.expenses, cb.revenues, cb.expenses + cb.revenues AS total,
(CAST(CAST(CONCAT(cb.yy,'-',cb.mm,'-01')AS CHAR) AS DATE)) AS dt
FROM contract_balance cb ORDER BY dt DESC
) AS A
GROUP BY A.cid

Optimize SQL Server Query

I have to do a query to get the total cost of previous month and compared to current month to calculate the percentage difference.
this is the script:
create table #calc
(
InvoiceDate Date,
TotalCost decimal (12,2)
)
insert into #calc values ('2013-07-01', 9470.36)
insert into #calc values ('2013-08-01', 11393.81)
and this is the query:
select InvoiceDate,
TotalCost,
PrevTotalCost,
(CASE WHEN (PrevTotalCost = 0)
THEN 0
ELSE (((TotalCost - PrevTotalCost) / PrevTotalCost) * 100.0)
END) AS PercentageDifference
from (
select a.InvoiceDate, a.TotalCost,
isnull((select b.TotalCost
from #calc b
where InvoiceDate = (select MAX(InvoiceDate)
from #calc c
where c.InvoiceDate < a.InvoiceDate)), 0) as PrevTotalCost
from #calc a) subq
Is there a more efficient way to do it for cgetting the previous month?
Using a ranking function to put more burden on sorts than table scans seems the fastest when using no indexes. The query below processed 6575 records in under a second:
SELECT
Main.InvoiceDate,
Main.TotalCost,
PreviousTotalCost=Previous.TotalCost,
PercentageDifference=
CASE WHEN COALESCE(Previous.TotalCost,0) = 0 THEN 0
ELSE (((Main.TotalCost - Previous.TotalCost) / Previous.TotalCost) * 100.00)
END
FROM
(
SELECT
InvoiceDate,
TotalCost,
OrderInGroup=ROW_NUMBER() OVER (ORDER BY InvoiceDate DESC)
FROM
Test
)AS Main
LEFT OUTER JOIN
(
SELECT
InvoiceDate,
TotalCost,
OrderInGroup=ROW_NUMBER() OVER (ORDER BY InvoiceDate DESC)
FROM
Test
)AS Previous ON Previous.OrderInGroup=Main.OrderInGroup+1
Using nested looping as the case when getting the previous invoice cost in a select subquery proves the slowest - 6575 rows in 30 seconds.
SELECT
X.InvoiceDate,
X.TotalCost,
X.PreviousTotalCost,
PercentageDifference=
CASE WHEN COALESCE(X.PreviousTotalCost,0) = 0 THEN 0
ELSE (((X.TotalCost - X.PreviousTotalCost) / X.PreviousTotalCost) * 100.00)
END
FROM
(
SELECT
InvoiceDate,
TotalCost,
PreviousTotalCost=(SELECT TotalCost FROM Test WHERE InvoiceDate=(SELECT MAX(InvoiceDate) FROM Test WHERE InvoiceDate<Main.InvoiceDate))
FROM
Test AS Main
)AS X
Your query processed 6575 records in 20 seconds with the biggest cost coming from the nested loops for inner join
select InvoiceDate,
TotalCost,
PrevTotalCost,
(CASE WHEN (PrevTotalCost = 0)
THEN 0
ELSE (((TotalCost - PrevTotalCost) / PrevTotalCost) * 100.0)
END) AS PercentageDifference
from (
select a.InvoiceDate, a.TotalCost,
isnull((select b.TotalCost
from Test b
where InvoiceDate = (select MAX(InvoiceDate)
from #calc c
where c.InvoiceDate < a.InvoiceDate)), 0) as PrevTotalCost
from Test a) subq
Using indexes would be a big plus unless you are required to use temp tables.
Hope this helps :)
SELECT
`current`.`InvoiceDate`,
`current`.`TotalCost`,
`prev`.`TotalCost` AS `PrevTotalCost`,
(`current`.`TotalCost` - `prev`.`TotalCost`) AS `CostDifference`
FROM dates `current`
LEFT JOIN
dates `prev`
ON `prev`.`InvoiceDate` <= DATE_FORMAT(`current`.`InvoiceDate` - INTERVAL 1 MONTH, '%Y-%m-01');
Screenshot of the results I got: http://cl.ly/image/0b3z2x1f2H1n
I think this might be what you're looking for.
Edit: I wrote this query in MySQL, so it's possible you may need to alter a couple minor syntax things for your server.

Checking for maximum length of consecutive days which satisfy specific condition

I have a MySQL table with the structure:
beverages_log(id, users_id, beverages_id, timestamp)
I'm trying to compute the maximum streak of consecutive days during which a user (with id 1) logs a beverage (with id 1) at least 5 times each day. I'm pretty sure that this can be done using views as follows:
CREATE or REPLACE VIEW daycounts AS
SELECT count(*) AS n, DATE(timestamp) AS d FROM beverages_log
WHERE users_id = '1' AND beverages_id = 1 GROUP BY d;
CREATE or REPLACE VIEW t AS SELECT * FROM daycounts WHERE n >= 5;
SELECT MAX(streak) AS current FROM ( SELECT DATEDIFF(MIN(c.d), a.d)+1 AS streak
FROM t AS a LEFT JOIN t AS b ON a.d = ADDDATE(b.d,1)
LEFT JOIN t AS c ON a.d <= c.d
LEFT JOIN t AS d ON c.d = ADDDATE(d.d,-1)
WHERE b.d IS NULL AND c.d IS NOT NULL AND d.d IS NULL GROUP BY a.d) allstreaks;
However, repeatedly creating views for different users every time I run this check seems pretty inefficient. Is there a way in MySQL to perform this computation in a single query, without creating views or repeatedly calling the same subqueries a bunch of times?
This solution seems to perform quite well as long as there is a composite index on users_id and beverages_id -
SELECT *
FROM (
SELECT t.*, IF(#prev + INTERVAL 1 DAY = t.d, #c := #c + 1, #c := 1) AS streak, #prev := t.d
FROM (
SELECT DATE(timestamp) AS d, COUNT(*) AS n
FROM beverages_log
WHERE users_id = 1
AND beverages_id = 1
GROUP BY DATE(timestamp)
HAVING COUNT(*) >= 5
) AS t
INNER JOIN (SELECT #prev := NULL, #c := 1) AS vars
) AS t
ORDER BY streak DESC LIMIT 1;
Why not include user_id in they daycounts view and group by user_id and date.
Also include user_id in view t.
Then when you are queering against t add the user_id to the where clause.
Then you don't have to recreate your views for every single user you just need to remember to include in your where clause.
That's a little tricky. I'd start with a view to summarize events by day:
CREATE VIEW BView AS
SELECT UserID, BevID, CAST(EventDateTime AS DATE) AS EventDate, COUNT(*) AS NumEvents
FROM beverages_log
GROUP BY UserID, BevID, CAST(EventDateTime AS DATE)
I'd then use a Dates table (just a table with one row per day; very handy to have) to examine all possible date ranges and throw out any with a gap. This will probably be slow as hell, but it's a start:
SELECT
UserID, BevID, MAX(StreakLength) AS StreakLength
FROM
(
SELECT
B1.UserID, B1.BevID, B1.EventDate AS StreakStart, DATEDIFF(DD, StartDate.Date, EndDate.Date) AS StreakLength
FROM
BView AS B1
INNER JOIN Dates AS StartDate ON B1.EventDate = StartDate.Date
INNER JOIN Dates AS EndDate ON EndDate.Date > StartDate.Date
WHERE
B1.NumEvents >= 5
-- Exclude this potential streak if there's a day with no activity
AND NOT EXISTS (SELECT * FROM Dates AS MissedDay WHERE MissedDay.Date > StartDate.Date AND MissedDay.Date <= EndDate.Date AND NOT EXISTS (SELECT * FROM BView AS B2 WHERE B1.UserID = B2.UserID AND B1.BevID = B2.BevID AND MissedDay.Date = B2.EventDate))
-- Exclude this potential streak if there's a day with less than five events
AND NOT EXISTS (SELECT * FROM BView AS B2 WHERE B1.UserID = B2.UserID AND B1.BevID = B2.BevID AND B2.EventDate > StartDate.Date AND B2.EventDate <= EndDate.Date AND B2.NumEvents < 5)
) AS X
GROUP BY
UserID, BevID