SSAS 2008 Partitions - sql-server-2008

I have a fairly big SSAS cube. The measures are partitioned to contain 12 months of data, and I have around 10 partitions for each measure. The problem is that every month, after a new set of data is added, the data from exactly 12 months before 'disappears' from the cube. For example, adding Jan 2017 data makes Jan 2016 data disappear from the view (accessed from Excel), even though the data is still in the database.
After another round of cube processing, this sets itself right. Any explanation for this phenomenon? My guess is that the cube is realigning the partitions when an extra month is added. How can I solve this issue? Thanks!
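For what it's worth, this kind of symptom often comes from partition source queries (or slice definitions) whose date boundaries are generated relative to the newest loaded month. A minimal T-SQL sketch of such a boundary, using a hypothetical fact table and date-key column:

-- Hypothetical source query bound to a 12-month partition.
-- If the boundaries are regenerated relative to the most recent month,
-- adding Jan 2017 shifts the window so that Jan 2016 falls outside every
-- processed partition until the next full process.
SELECT *
FROM dbo.FactSales
WHERE OrderDateKey >= 20160101   -- inclusive lower boundary
  AND OrderDateKey <  20170101;  -- exclusive upper boundary

Comparing the generated WHERE clauses of adjacent partitions right after the monthly load should show whether one month is temporarily left uncovered.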

Related

How to do calculated Attributes in a database

I hope you can help me understand how to solve my issue. The basis is the creation of a database where the working hours of employees should be stored.
With these hours a little math is involved, and the combination of database + math seems to be the problem.
I want to store the theoretically workable hours of an employee:
52 (weeks a year) * 40 (hours a week) = 2080; minus holidays etc. -> 1900 expected hours yearly.
His actually worked time would go up by 8 hours each day until he reaches 1900, which is already an issue, because I don't really know how to implement that.
But the problem continues:
This time shall be split equally across 12 months. Okay, so 1900 divided by 12 in 12 different columns... sounds stupid, but now he reports sick in February, his actual time decreases within that month, and accordingly his overall working time decreases as well.
There are also things like part-time workers or people taking a sabbatical, and these hours also need to be connected to different projects (another table in the same DB).
In Excel this issue was relatively easy to solve, but with a pure DB I'm somewhat lost as to how to approach it.
So, two questions: is something like this even possible in a MySQL DB (I somehow doubt it)?
And how would I do it (in the DB or with some additional software/front end)?
It sounds like you are trying to build a DTR (Daily Time Records) system. I recommend designing the database so that it caters to flexible scenarios for all types of employees: a common way of storing information (date and time) from which the working hours of these people can be calculated.
Worry about the algorithms later; they will follow from your database design.
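As a starting point, a minimal MySQL sketch along those lines (all table and column names here are hypothetical): store one row per employee per day, keep the yearly expectation on the employee, and derive the monthly totals with a query instead of twelve columns.

-- One row per employee per day; sick days, vacation and sabbaticals are
-- rows too, so they simply contribute 0 (or reduced) worked hours.
CREATE TABLE time_entry (
    entry_id     INT AUTO_INCREMENT PRIMARY KEY,
    employee_id  INT          NOT NULL,
    project_id   INT          NULL,            -- link to your projects table
    work_date    DATE         NOT NULL,
    hours_worked DECIMAL(4,2) NOT NULL,
    entry_type   VARCHAR(20)  NOT NULL         -- 'worked', 'sick', 'vacation', ...
);

-- Monthly actual vs. expected hours; the yearly target (e.g. 1900) lives on
-- the employee row, so part-time workers get their own smaller target.
SELECT e.employee_id,
       DATE_FORMAT(t.work_date, '%Y-%m')  AS month,
       SUM(t.hours_worked)                AS actual_hours,
       e.expected_hours_per_year / 12     AS expected_hours
FROM   employee e
JOIN   time_entry t ON t.employee_id = e.employee_id
GROUP BY e.employee_id, DATE_FORMAT(t.work_date, '%Y-%m'), e.expected_hours_per_year;

So yes, this is possible in MySQL: the math lives in queries (or in a front end/report) rather than in stored columns.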

Dynamically calculating a throughput period

I'm trying to find an elegant way of calculating a throughput period.
I currently have two dates:
1 Nov 2017
31 Jan 2018
If I have a record that falls between these two dates, it will be set to have throughput period 1.
As time progresses, my records might have a date that is past 31 Jan, and then it needs to fall into the second period, so period 2, etc.
This continues (potentially) until the end of time. My current setup is a linking table with about 7 sets of different (preset) throughput periods. I join to this table to determine the period that the report is pulling for.
This isn't the greatest way of doing it, and I don't (yet) have the ability to write code that dynamically calculates it. Any ideas how SQL can be used to calculate this on the fly?
Looking to brainstorm here.
Thanks!
If you have preset ranges, then a CASE expression could work.
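For illustration, a T-SQL sketch with hypothetical table/column names: the CASE form for preset ranges, plus a date-arithmetic alternative that works if every period is exactly three months long, starting 1 Nov 2017.

-- Preset ranges via CASE.
SELECT r.record_id,
       CASE
           WHEN r.record_date BETWEEN '2017-11-01' AND '2018-01-31' THEN 1
           WHEN r.record_date BETWEEN '2018-02-01' AND '2018-04-30' THEN 2
           WHEN r.record_date BETWEEN '2018-05-01' AND '2018-07-31' THEN 3
           ELSE NULL
       END AS throughput_period
FROM dbo.Records AS r;

-- Dynamic alternative: number fixed-length (3-month) periods on the fly,
-- so no linking table or preset list is needed.
SELECT r.record_id,
       DATEDIFF(MONTH, '2017-11-01', r.record_date) / 3 + 1 AS throughput_period
FROM dbo.Records AS r;

The second form never runs out of preset periods, which addresses the "until the end of time" part.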

Daylight saving in timestamps

I'm running a MATLAB function (fastinsert) to insert data into MySQL. The results are correct for the whole year except for one hour in March, during the daylight-saving switch. In fact, it seems that I cannot insert data between 2:00 am and 3:00 am on that day.
For example with:
ts = [2006 3 26 2 30 0]
Looking inside the MATLAB function, I found that the problem lies in:
java.sql.Timestamp(ts(1)-1900,ts(2)-1,ts(3),ts(4),ts(5),secs,nanosecs)
that gives as a result:
2006-03-26 03:30:00.0
How can I solve this?
I've run into similar problems in storing datetime on many occasions. Treating the value as a derived value seems to make the most sense. In other words, instead of storing the local time store the value as GMT and Time Zone. Then derive the appropriate value when you query the data.
This has the added benefit of making it possible to store values from multiple locations without having to worry about confusion down the road.
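As a hedged MySQL sketch of that idea (table and column names are made up): keep the instant in UTC plus the source time zone, and convert only when reading.

-- Store UTC, which has no DST gap, plus the zone the value came from.
CREATE TABLE measurement (
    measured_at_utc DATETIME    NOT NULL,
    source_tz       VARCHAR(64) NOT NULL,   -- e.g. 'Europe/Rome'
    value           DOUBLE      NOT NULL
);

-- Derive local time at query time (CONVERT_TZ needs the MySQL time zone
-- tables to be loaded for named zones).
SELECT CONVERT_TZ(measured_at_utc, 'UTC', source_tz) AS measured_local,
       value
FROM measurement;

The non-existent local time 02:30 on 26 March then never has to be represented: whatever real instant the measurement belongs to has an unambiguous UTC value.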

SSIS 2012 incremental load for 200-300 tables

My project started 2 months ago, and I'm already transferring over 100 tables to the Data Warehouse on each run.
I'll probably reach 200-300 tables pretty soon and do not believe my current development approach will scale.
I still get a new version every 3 weeks (product dev sprints), and tables are still changing their structure (data types, column names, new columns, etc.), which gives me a real headache, so I ignored it for the first few weeks.
How did I ignore it?
Truncated all the tables before I take them to the ODS (Operational Data Store)/MRR layer
Took all the data from the source system fully to the MRR layer
Created only the dimensions' "incremental" tables (which still change every week with new columns and changed data types)
Dynamically created and populated the staging tables and the warehouse tables
Now my model has started to take shape, so I have to take care of the incremental loads.
It seems easy, since I have an update time for each record, but I also have deletions in my source system. How can I approach this?
I've considered CDC, but this would be time-consuming, as I'd have to set it up table by table.
Any solutions for someone who starts with 100-200 tables?
I follow a style similar to your "ignore it" design for as long as possible. Full refresh keeps your design agile and can go as fast as 1 million rows per minute.
When this eventually runs out of legs, and there are deletions in the source system, I delete all my data back for a date range (e.g. 3 months) as agreed with the data experts. You may have to break that delete into chunks e.g. day by day. I also try to fully refresh such data e.g. each weekend (as often the data experts are misinformed).
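A rough T-SQL sketch of that rolling-window pattern (table and column names are hypothetical), run per table before the load:

-- Delete the agreed trailing window from the warehouse table; the package
-- then reloads the same window from the source, so rows deleted at the
-- source simply never reappear. Break the delete into day-sized chunks if
-- the transaction log becomes a problem.
DECLARE @cutoff date = DATEADD(MONTH, -3, CAST(GETDATE() AS date));

DELETE FROM dw.FactOrders
WHERE UpdateTime >= @cutoff;

-- The extract query for the reload uses the same boundary, e.g.
-- SELECT ... FROM src.Orders WHERE UpdateTime >= @cutoff;

Because the boundary is just a parameter, the same Execute SQL Task plus data flow can be reused across the 100-200 tables.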

good design for a db that decreases resolution of accuracy after time

I have something like 20,000 data points in a database and I want to display them on the Google annotated graph. I think around 2,000 points would be a good number to actually use in the graph, so I want to use averages instead of the real number of data points I have.
This data counts the frequency of something at a certain time; it would be like Table(frequency, datetime).
So for the first week I will have a datetime interval of every 10 minutes, and frequency will be an average of all the frequencies in that 10-minute interval. Similarly, for the month after that I will have a datetime interval of an hour, etc.
I think this is something you can see on Google Finance too: after some time the resolution of the data points decreases, even when you zoom in.
So what would be a good design for this? Is there already a tool that exists to do something like this?
I've already thought of (though it might not be good) a giant table of all 20,000 points and several smaller tables representing each time interval (1 week, 1 month, etc.) that are built through queries against the larger table and constantly updated and trimmed with new averages.
Keep the raw data in the DB in one table. Have a second reporting table which you populate from the raw table with a script or query. The transformation that populates the reporting table can group and average the buckets however you want. The important thing is to not transform your data on initial insert; keep all your raw data. That way you can always roll back or rebuild if you mess something up.
ETL. Learn it. Love it. Live it.
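A hedged, MySQL-flavoured sketch of that raw-plus-reporting split (all names are hypothetical): raw inserts stay untouched, and a scheduled query rebuilds the averaged buckets.

-- Raw table: every point exactly as recorded, never transformed.
CREATE TABLE raw_points (
    recorded_at DATETIME NOT NULL,
    frequency   DOUBLE   NOT NULL
);

-- Reporting table: one averaged row per 10-minute bucket.
CREATE TABLE report_points_10min (
    bucket_start  DATETIME NOT NULL PRIMARY KEY,
    avg_frequency DOUBLE   NOT NULL
);

-- Repopulate the last week's buckets; coarser tables (hourly, daily)
-- follow the same pattern with a bigger bucket size.
REPLACE INTO report_points_10min (bucket_start, avg_frequency)
SELECT FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(recorded_at) / 600) * 600) AS bucket_start,
       AVG(frequency)                                                AS avg_frequency
FROM raw_points
WHERE recorded_at >= NOW() - INTERVAL 7 DAY
GROUP BY bucket_start;

The chart then reads from whichever reporting table matches the zoom level, while the 20,000 raw points stay available for rebuilding everything from scratch.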