Sorry for the newbie question.
I'm working on a project where I have to take data from an Excel source and build a small DW from it, so I was thinking of these steps:
Source > staging > OLTP > ODS > DW
The requirements say I have to create an ODS layer, but I'm not sure what my process should be or what exactly needs to be inside the ODS. Can you please let me know what steps I should take?
Thanks
As a matter of fact, an ODS (Operational Data Store) is a database in which you create every relation, check constraint, and any other check your business needs, so that all the data you want to use in the data warehouse is validated in this layer. Design it like a normal database, but it does not have to be normalized; you can keep whatever redundancy you want. Note that this design is not based on facts and dimensions; its only job is to cleanse data for the data warehouse.
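To make the staging-to-ODS step a bit more concrete, here is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for whatever engine the project uses. The table names stg_sales and ods_sales, their columns, and the sample rows are all hypothetical. Staging holds the raw text from the spreadsheet, and the ODS table carries the constraints, so bad rows are rejected on load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Staging: raw rows as read from the Excel sheet, everything kept as text.
    CREATE TABLE stg_sales (customer TEXT, sale_date TEXT, amount TEXT);

    -- ODS: typed columns plus the checks the business rules call for
    -- (hypothetical rules: non-empty customer, ISO-shaped date, amount >= 0).
    CREATE TABLE ods_sales (
        sale_id   INTEGER PRIMARY KEY,
        customer  TEXT NOT NULL CHECK (length(customer) > 0),
        sale_date TEXT NOT NULL
            CHECK (sale_date GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'),
        amount    REAL NOT NULL CHECK (amount >= 0)
    );
""")

# Hypothetical sample data: one clean row and three rows that break a rule.
conn.executemany(
    "INSERT INTO stg_sales VALUES (?, ?, ?)",
    [
        (" Acme ", "2024-01-05", "100.50"),
        ("Globex", "2024-01-06", "-20"),
        ("", "2024-01-07", "15"),
        ("Initech", "not a date", "30"),
    ],
)

loaded, rejected = 0, []
for customer, sale_date, amount in conn.execute("SELECT * FROM stg_sales").fetchall():
    try:
        # Light cleansing (trim, cast), then let the ODS constraints decide.
        conn.execute(
            "INSERT INTO ods_sales (customer, sale_date, amount) VALUES (?, ?, ?)",
            (customer.strip(), sale_date, float(amount)),
        )
        loaded += 1
    except (sqlite3.IntegrityError, ValueError) as exc:
        rejected.append((customer, sale_date, amount, str(exc)))

print(loaded, "row(s) loaded,", len(rejected), "rejected")
```

The rejected list could be written to an error table for review; the fact and dimension tables of the DW would then be filled from ods_sales only.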
We're building a new piece of software for our company, where we want to manage our inventory.
The goal for the tool is to be customizable by the customer.
My part is mostly on the DB side. We have chosen MariaDB as our DB engine, and while we are working with the rather static functionality of a relational DB, we want to realize a rather dynamic solution.
Our chief programmer has explained to me the basics of the concept I shall implement into our DB:
We want a table which basically just consists of other tables.
Let's call it "maintable".
Maintable shall then reference its "attributes", which are the other tables.
For example, maintable references "Workstations".
"Workstations" then contains attributes like CPU, RAM, Drives, PSU, etc.
And now comes the part which I didn't completely understand. The actual VALUES to these attributes in "Workstations" shall not be inserted into "Workstations". Instead, they are packed into another (junction?) table.
The reason for this approach is that the customer shall be able to customize the DB to his needs.
When the customer wants to add another attribute, he shall be able to do so. For example, if a new PSU now requires another attribute for an additional serial number, then the customer shall be able to simply create this new attribute in the front-end input form and then persist it to the DB.
If someone could point to good tutorials explaining this type of DB concept, then I would be glad as well! :=)
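What is described here looks a lot like the pattern usually called entity-attribute-value (EAV), which may be a useful search term for tutorials. Below is a minimal sketch of that idea, assuming SQLite through Python's sqlite3 module rather than MariaDB. All table and column names (entity_type, attribute, entity, attribute_value) and the sample values are hypothetical, not the chief programmer's actual design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- One row per customizable item type, e.g. 'Workstations' (the "maintable").
    CREATE TABLE entity_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

    -- One row per attribute of an item type, e.g. CPU, RAM, PSU.
    CREATE TABLE attribute (
        id             INTEGER PRIMARY KEY,
        entity_type_id INTEGER NOT NULL REFERENCES entity_type(id),
        name           TEXT NOT NULL,
        UNIQUE (entity_type_id, name)
    );

    -- One row per concrete item, e.g. one specific workstation.
    CREATE TABLE entity (
        id             INTEGER PRIMARY KEY,
        entity_type_id INTEGER NOT NULL REFERENCES entity_type(id)
    );

    -- The actual values: one row per (item, attribute) pair.
    CREATE TABLE attribute_value (
        entity_id    INTEGER NOT NULL REFERENCES entity(id),
        attribute_id INTEGER NOT NULL REFERENCES attribute(id),
        value        TEXT,
        PRIMARY KEY (entity_id, attribute_id)
    );
""")

conn.execute("INSERT INTO entity_type (id, name) VALUES (1, 'Workstations')")
conn.executemany(
    "INSERT INTO attribute (id, entity_type_id, name) VALUES (?, 1, ?)",
    [(1, "CPU"), (2, "RAM"), (3, "PSU")],
)
conn.execute("INSERT INTO entity (id, entity_type_id) VALUES (1, 1)")
conn.executemany(
    "INSERT INTO attribute_value VALUES (1, ?, ?)",
    [(1, "Ryzen 5"), (2, "16 GB"), (3, "650 W")],
)

# The customer adds a new attribute at runtime: a plain INSERT, no ALTER TABLE.
conn.execute("INSERT INTO attribute (id, entity_type_id, name) VALUES (4, 1, 'PSU serial number')")
conn.execute("INSERT INTO attribute_value VALUES (1, 4, 'SN-0001')")

# Reassemble one workstation from its attribute rows.
rows = conn.execute("""
    SELECT a.name, v.value
    FROM attribute_value AS v
    JOIN attribute AS a ON a.id = v.attribute_id
    WHERE v.entity_id = 1
    ORDER BY a.id
""").fetchall()
workstation = dict(rows)
print(workstation)
```

The trade-off is that every value ends up as text in one column, so type checks and per-attribute constraints have to be handled in the application or with extra columns.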
I am designing a reward system for my game. I have a table called VirtualItem (VI) (key, display_name); the data it contains could be (gd, gold), (dm, diamond). Then I have a Reward table (id, reward_items, etc.).
Currently reward_items is a JSON array of VIs: [{key: dm, count: 5}, {key: gd, count: 10}]. There is a web portal that allows users to CRUD reward_items.
My question is: should I use the current flat structure, or add another layer in between and use a reference in reward_items instead? Something like reward_items: set_id (referring to a VirtualItemSet table).
Apparently, using the flat structure (JSON array) makes the query easy. But I probably also need to put display_name inside the JSON. In addition, when a VI changes, it's hard to update reward_items.
Using a relationship makes the DB schema more complex and also complicates the backend CRUD operations on reward_items (I need to create VirtualItemSet rows on the fly, etc.). It also makes queries more complicated. But it will support dynamic changes to VI.
What's your opinion on this? Or is there a better database for this type of scenario?
Thanks,
Chen
As you notice: if you use a flat column, you can't use SQL joins, and the database can't enforce data consistency through foreign keys. In my experience this decision always causes problems later, so I think it's better not to use flat columns.
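For illustration, here is a rough sketch of the relational alternative, assuming SQLite through Python's sqlite3 module. The junction table reward_item and the column names item_key and item_count are hypothetical (renamed from key and count to stay clear of SQL keywords), and the rename to 'gems' is made up to show the effect of changing a VI.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE virtual_item (
        item_key     TEXT PRIMARY KEY,
        display_name TEXT NOT NULL
    );
    CREATE TABLE reward (id INTEGER PRIMARY KEY);

    -- Junction table: replaces the JSON array in reward_items.
    CREATE TABLE reward_item (
        reward_id  INTEGER NOT NULL REFERENCES reward(id),
        item_key   TEXT    NOT NULL REFERENCES virtual_item(item_key),
        item_count INTEGER NOT NULL CHECK (item_count > 0),
        PRIMARY KEY (reward_id, item_key)
    );

    INSERT INTO virtual_item VALUES ('gd', 'gold'), ('dm', 'diamond');
    INSERT INTO reward (id) VALUES (1);
    INSERT INTO reward_item VALUES (1, 'dm', 5), (1, 'gd', 10);
""")

# A VI changes: one UPDATE, and every reward sees the new name through the join.
conn.execute("UPDATE virtual_item SET display_name = 'gems' WHERE item_key = 'dm'")
rows = conn.execute("""
    SELECT vi.display_name, ri.item_count
    FROM reward_item AS ri
    JOIN virtual_item AS vi ON vi.item_key = ri.item_key
    WHERE ri.reward_id = 1
    ORDER BY vi.item_key
""").fetchall()
print(rows)

# A reward that points at a non-existent item is rejected by the database.
try:
    conn.execute("INSERT INTO reward_item VALUES (1, 'xx', 1)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print("foreign key enforced:", fk_enforced)
```

Note that this sketch links rewards to items directly, so no VirtualItemSet rows have to be created on the fly; whether that fits depends on whether item sets need to be shared between rewards.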
I'm getting daily dumps for a table (let's say a students table) from an external source. In order to reduce downtime while the table is being truncated and updated with the new data, I'm planning to maintain two copies of this table (students_1 and students_2).
Both these need to be mapped with Student model on an alternating daily basis. So if today I am using data from students_1, tomorrow, once data has been entered to students_2, I'll need to switch seamlessly to that one.
So my questions are
1) Is this approach good enough, or is there a better one?
2) For hot swapping tables, is it fine to just maintain a file indicating the table currently in use and then call set_table_name from a method that reads this file? Is there a more elegant solution?
You can do this as part of your data loading strategy. I wouldn't mess with storing table names or using non-standard table names. After the data has finished loading, execute a table rename command instead; it is done atomically and should not interrupt your app.
RENAME TABLE students TO students_secondary_temp, students_secondary TO students, students_secondary_temp TO students_secondary;
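The load-then-swap flow can be tried out locally with the sketch below. It assumes SQLite through Python's sqlite3 module, which has no multi-table RENAME TABLE, so three ALTER TABLE ... RENAME TO statements inside one transaction stand in for it. On MySQL you would run the single RENAME TABLE statement above instead. The load_and_swap helper and the sample rows are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE students_secondary (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO students VALUES (1, 'yesterday');
""")


def load_and_swap(conn, rows):
    """Refill the offline copy, then swap the two table names."""
    # Step 1: load the daily dump while `students` keeps serving reads.
    conn.execute("BEGIN")
    conn.execute("DELETE FROM students_secondary")
    conn.executemany("INSERT INTO students_secondary VALUES (?, ?)", rows)
    conn.execute("COMMIT")

    # Step 2: swap the names; either all three renames apply or none do.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE students RENAME TO students_secondary_temp")
        conn.execute("ALTER TABLE students_secondary RENAME TO students")
        conn.execute("ALTER TABLE students_secondary_temp RENAME TO students_secondary")
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


load_and_swap(conn, [(1, "today"), (2, "today")])
current = conn.execute("SELECT name FROM students ORDER BY id").fetchall()
previous = conn.execute("SELECT name FROM students_secondary").fetchall()
print(current, previous)
```

Because the application always queries students, the model needs no set_table_name logic and no file tracking which copy is live.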
I have a large database of client details, and I need to generate a totally new field of data based on a single other field. It would be a simple IF..THEN deal.
Example:
The source field has data that looks like this "BAR DIN" (Barrie Dinners) and I need to fill a new field with "Dinners".
From what I understand, Data Macros are the right way to do this, but I'd prefer not to buy Access 2010. There should be a way to do this with normal macros. This update only needs to be done once a year and can be done manually. I'm mostly looking for a way to avoid having to enter all that data manually for each customer.
Create a separate table to translate between the two:
source_field    new_field
BAR DIN         Barrie Dinners
FOO BROS        Foo Brothers
Anytime you need to see the "new_field" values, JOIN that translation table to your original table (JOIN on source_field) to look them up. This approach is one of the fundamental reasons relational databases were created in the first place. This way your database will always be "up to date" without the need for any macros to populate a redundant field.
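As a small sketch of that lookup, assuming SQLite through Python's sqlite3 module instead of Access: the clients table, its columns, and the sample client names are hypothetical, while the translation rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, source_field TEXT);
    CREATE TABLE translation (
        source_field TEXT PRIMARY KEY,
        new_field    TEXT NOT NULL
    );
    INSERT INTO clients VALUES
        (1, 'Alice', 'BAR DIN'),
        (2, 'Bob',   'FOO BROS'),
        (3, 'Carol', 'UNKNOWN');
    INSERT INTO translation VALUES
        ('BAR DIN',  'Barrie Dinners'),
        ('FOO BROS', 'Foo Brothers');
""")

# LEFT JOIN keeps clients whose code has no translation yet (new_field is NULL),
# which makes missing entries in the translation table easy to spot.
rows = conn.execute("""
    SELECT c.name, t.new_field
    FROM clients AS c
    LEFT JOIN translation AS t ON t.source_field = c.source_field
    ORDER BY c.id
""").fetchall()
print(rows)
```

In Access the same thing can be done with a saved query that joins the two tables on source_field.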
I'm working with a third-party software package that runs on its own database. We are using it as the user management backbone of our application. We have an API to retrieve data and access info.
Due to the nature of information changing daily, we can only use the user_id as a pseudo FK in our application, not storing info like their username or name. The user information can change (like person name...don't ask).
What I need to do is sort and filter (with paged results) one of my queries by the person's name, not the user_id we have. I'm able to get an array of the user info beforehand. Would my best bet be creating a temporary table that adds an additional field, and then sorting by that?
Using MySQL for the database.
You could adapt the stored procedure on this page to suit your needs. It is a multi-purpose, very dynamic procedure, but you could alter it to filter the person table.
http://weblogs.asp.net/pwilson/archive/2003/10/10/31456.aspx
You could combine the data into an array of objects, then sort the array.
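A minimal sketch of that in-memory approach, written in Python rather than the asker's own stack. The function sort_and_page, the dictionary keys, and the sample data are hypothetical. It merges the API's user array into the query rows, filters by name, sorts, and cuts out one page.

```python
def sort_and_page(records, users, page, page_size, name_filter=""):
    """Attach names from the user API, filter and sort by name, return one page."""
    # Index the API result by user_id for constant-time lookups.
    names = {u["user_id"]: u["name"] for u in users}

    # Copy each row and add the name; unknown ids get an empty name.
    merged = [dict(r, name=names.get(r["user_id"], "")) for r in records]

    # Case-insensitive substring filter; an empty filter keeps everything.
    wanted = name_filter.casefold()
    matching = [m for m in merged if wanted in m["name"].casefold()]

    # Sort by name, with user_id as a tie-breaker so paging stays stable.
    matching.sort(key=lambda m: (m["name"].casefold(), m["user_id"]))

    start = (page - 1) * page_size
    return matching[start:start + page_size]


records = [
    {"user_id": 3, "order": "A-17"},
    {"user_id": 1, "order": "A-18"},
    {"user_id": 2, "order": "A-19"},
]
users = [
    {"user_id": 1, "name": "Zoe"},
    {"user_id": 2, "name": "adam"},
    {"user_id": 3, "name": "Mia"},
]

first_page = sort_and_page(records, users, page=1, page_size=2)
print([r["name"] for r in first_page])
```

This loads every row before paging, so it only suits result sets that fit comfortably in memory.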
Yes, but you should consider specifically where you will make the temporary table. If you do it in your web application then your web server is stuck allocating memory for your entire table, which may be horrible for performance. On the other hand, it may be easier to just load all your objects and sort them as suggested by eschneider.
If you have the user_id as a parameter, you can create a user defined function which retrieves the username for you within the stored procedure.
Database is on different servers. For all purposes, we access it via an API and the data is then turned into an array.
For now, I've implemented the solution using LINQ to filter and sort the array of objects.
Thanks for the tips and helping me go in the right direction.