Column name collision in SQL Server view joining database architecture

Column name collision in SQL Server view joining database architecture - sql-server-2008

First of all, sorry the way I write, bad English here.
Please, consider the following scenario:
I made my database architecture combining views as such:
Entity City has the columns { city_id, city_name, state_name, ... }
Entity User has the columns { user_id, user_name, user_login, city_id, ... }
Each entity has a view, the city view does not have FK to any other entity, so its a simple select.
User entity have one FK to city entity, so it makes a join with the city entity in the view.
As the following example:
create view vw_user as
select user.user_id,
user.user_name,
user.user_login,
vw_city.*
from user (nolock)
inner join vw_city (nolock) on vw_city.city_id = user.city_id
go
So if you have an entity like user_access with a FK to user, the view of user_access will have an inner join to vw_user, and it will bring up all the columns from user and city entities.
In the end, I just make a query on the view, and it returns the full entity with all the foreign references.
This works great, and make the maintenance procedures very easier, I have a stored procedure that recompile all the views if any change is needed to one of the entities.
But, it has a problem, and I discovered this in a very bad situation, the system that uses this architecture is working very fast and smooth already, and I need to join two entities that share the entity city for example, so it generates a collision of columns, SQL Server does not allow this in views, you can execute a query with collision with no problem, but you can't do it in a view.
So, this is my problem, I need to find a way to fix this, but I can't find an answer to it myself.
The only thing I ended up doing is making a hard-coded view with the columns renamed.
But I want a solution for the entire architecture, something permanent, like an upgrade.
This architecture is followed in code, with C#, so there is a dependence on the column names.
When I read the data from the query, I need to get the data from the column name, so this code can be reused with other classes, doing the same thing that the view does in the query, in the c# code.
So, getting data with ordinal is out of the question.
Any ideas?
Thanks, and sorry, but this is very hard to describe.

Well, just to close the subject, as SMC suggested i have used aliases to identify the columns and the tables in my queries.
So the query looks like this now - using the example from the topic:
Select 1.user_id as 1,
1.user_name as 2,
1.user_login as 3,
1.city_id as 4,
2.city_id as 5,
2.city_name as 6,
2.state_name as 7
from user as 1
inner join city as 2 on 2.city_id = 1.city_id
Just notice the columns 4 and 5, they will not collide now :)
Wen my application makes this query he also make the Classes involved to know their aliases in the current query.
It works fine, solved my problem and i don't need to use views anymore.
In the end, the application is faster then before.
That's it :) hope it helps anybody out there.

It's because the Column Name in each View or Function must be unique. You need to be specific while selecting the column Names. Use Alias and then refer the column Names.
You tried to create VIEW using a single column name more than once in the statement.
Remove * and give explicit column name

Related

MySQL Command what does a point mean?

I'm a newbie in mysql and have to write a implemention for a custom mysql asp.net identity storage.
I follow this tutorial and the first steps are done.
https://learn.microsoft.com/en-us/aspnet/identity/overview/extensibility/implementing-a-custom-mysql-aspnet-identity-storage-provider
Now i have the follow mysql command:
"Select Roles.Name from UserRoles, Roles where UserRoles.UserId = #userId and UserRoles.RoleId = Roles.Id"
My problem is now that i dont know how the table have to look for this request?
I would say:
Tablename : Roles
Select: Roles and Name? or is it a name?
same with UserRoles.UserID and UserRoles.RoleId
What does the point mean?
Thanks a lot

You question is quite unclear, however, if I understood correctly, you can't figure out clearly how the database schema you are using is structured and what you'll get from this query.
The query you have written SELECTs the data field called Name from the table called Roles. In order to do this, the query uses data coming from two tables: one is the Roles table itself, the other is called UserRoles.
It will extract Names data from the Roles table only for the Roles entries that have the Id field matching with the RoleId field of the entries in the UserRoles table that have the UserId equal to the given #UserId.
In other words, this SELECT query in will give you as a result a list of Names coming from the entries in the Roles table which match the given conditional check, which is what is written after the where SQL condition: where UserRoles.UserId = #userId and UserRoles.RoleId = Roles.Id.
Finally, the point "." in SQL queries is used to disambiguate between fields (or columns, if you want to call it so) with same name but coming from different tables. It is quite common that all the tables have an Id field, for example. You can identify the correct Id field in your database by writing Table1.Id, Table2.Id, and so on. Even if you don't have naming conflicts in your tables columns, adding the table name can be very good for code readability.
Edit:
As other users correctly pointed out in the comments to your question, you should also have a look to what an SQL JOIN operation is. Since you are searching data using information coming from different tables, you are actually doing an implicit JOIN on those tables.

Extremely basic SQL Misunderstanding

I'm preparing for an exam in databases and SQL and I'm solving an exercise:
We have a database of 4 tables that represent a human resources company. The tables are:
applicant(a-id,a-name,a-city,years-of-study),
job(job-name,job-id),
qualified(a-id,job-id)
wish(a-id,job-id).
the table applicant represents the table of applicants obviously. And jobs is the table of available jobs. the table qualified shows what jobs a person is qualified for, and the table wish shows what jobs a person is interested in.
The question was to write a query that displays for each job-id, the number of applicants that are both qualified and interested to work in.
Here is the solution the teacher wrote:
Select q1.job_id
, count(q1.a_id)
from qualified as q1
, wish as w1
Where q1.a_id = w1.a_id
and q1.job_id = w1.job_id
Group by job_id;
That's all well and good, I'm not sure why we needed that "as q1" and "as w1", but i can see why it works.
And here is the solution I wrote:
SELECT job-id,COUNT(a-id) FROM job,qualified,wish WHERE (qualified.a-id=wish.a-id)
GROUP BY job-id
Why is my solution wrong? And also - From which table will it select the information? Suppose I write SELECT job-id FROM job,qualified,wish. From which table will it take the information? because job-id exists in all 3 of these tables.

You can only refer to tables mentioned in the FROM clause. If it's ambiguous (because more than one has a column of the same name) then you need to be explicit by qualifying the name. Usually the qualifier is an alias but it could also be the table name itself if an alias wasn't specified.
There's a concept of a "natural join" which joins tables on common column(s) between two tables. Not all systems support that notation but I think MySQL does. I believe these systems usually collapse the joined pairs into a single column.
select q1.job_id, count(q1.a_id) from qualified as q1, wish as w1
where q1.a_id = w1.a_id and q1.job_id = w1.job_id
group by job_id;
I don't think I've worked on any systems that would have accepted the query above because the grouping column would have been strictly unclear even though the intention really is not. So if it truly does work correctly on MySQL then my guess is that it recognizes the equivalence of the columns and cuts you some slack on the syntax.
By the way, your query appears to be incorrect because you only included a single column in a join that requires two columns. You also included a third table which means that your result will effectively do a cross join of every row in that table. The grouping is going to still going to reduce it to one row per job_id but the count is going to be multiplied by the number of rows in the job table. Perhaps you added that table thinking it would hurt to add it just in case you need it but that is not what it means at all.

Your query will list non-existing jobs in case the database has orphan records in applicant and qualified, and might also omit jobs that have no qualified and willing candidates.
I'm not exactly sure, because I have no idea if there's any database that will accept COUNT(a-id) when there's no information about the table from which to take this value.
edit: Interestingly it looks like both of these problems are shared by both of the solutions, but shawnt00 has a point: your solution makes a huge pointless cartesian of three tables: see it without the group by.
My current best guess for a working answer would therefore be http://sqlfiddle.com/#!9/09d0c/6

How to set up relational database tables for this many-to-many relationship?

I have a type of data called a chain. Each chain is made up of a specific sequence of another type of data called a step. So a chain is ultimately made up of multiple steps in a specific order. I'm trying to figure out the best way to set this up in MySQL that will allow me to do the following:
Look up all steps in a chain, and get them in the right order
Look up all chains that contain a step
I'm currently considering the following table set up as the appropriate solution:
TABLE chains
id date_created
TABLE steps
id description
TABLE chains_steps (this would be used for joins)
chain_id step_id step_position
In the table chains_steps, the step_position column would be used to order the steps in a chain correctly. It seems unusual for a JOIN table to contain its own distinct piece of data, such as step_position in this case. But maybe it's not unusual at all and I'm just inexperienced/paranoid.
I don't have much experience in all this so I wanted to get some feedback. Are the three tables I suggested the correct way to do this? Are there any viable alternatives and if so, what are the advantages/drawback?

You're doing it right.
Consider a database containing the Employees and Projects tables, and how you'd want to link them in a many-to-many fashion. You'd probably come up with an Assignments table (or Project_Employees in some naming conventions).
At some point you'd decide you want not only to store each project assignment, but you'd also want to store when the assignment started, and when it finished. The natural place to put that is in the assignment itself; it doesn't make sense to store it either with the project or with the employee.
In further designs you might even find it necessary to store further information about the assignment, for example in an employee review process you may wish to store feedback related to their performance in that project, so you'd make the assignment the "one" end of a relationship with a Review table, which would relate back to Assignments with a FK on assignment_id.
So in short, it's perfectly normal to have a junction table that has its own data.

That looks fine, and it's not unusual for the join table to contain a position/rank field.
Look up all steps in a chain, and get them in the right order
SELECT * FROM chains_steps
LEFT JOIN steps ON steps.id = chains_steps.step_id
WHERE chains_steps.chain_id = ?
ORDER BY chains_steps.step_position ASC
Look up all chains that contain a step
SELECT DISTINCT chain_id FROM chains_steps
LEFT JOIN chains ON chains.id = chains_steps.chain_id

I think that the plan you've outlined is the correct approach. Don't worry too much about the presence of step_position on your mapping table. After all the step_position is a bit of data that is directly related to a step in the context of a chain. So the chains_steps table is the right place for it IMHO.
Some things to think about:
Foreign keys - use 'em!
Unique key on the chains_steps table - can a step be present in more than one position in a single chain? What about in different chains?
Good luck!

MySql - Select * from 2 tables, but Prefix Table Names in the Resultset?

I'd like to select * from 2 tables, but have each table's column name be prefixed with a string, to avoid duplicate column name collissions.
For example, I'd like to have a view like so:
CREATE VIEW view_user_info as (
SELECT
u.*,
ux.*
FROM
user u,
user_ex ux
);
where the results all had each column prefixed with the name of the table:
e.g.
user_ID
user_EMAIL
user_ex_ID
user_ex_TITLE
user_ex_SIN
etc.
I've put a sql fiddle here that has the concept, but not the correct syntax of course (if it's even possible).
I'm using MySql, but would welcome generic solutions if they exist!
EDIT: I am aware that I could alias each of the fields, as mentioned in one of the comments. That's what I'm currently doing, but I find at the start of a project I keep having to sync up my tables and views as they change. I like the views to have everything in them from each table, and then I manually select out what I need. Kind of a lazy approach, but this would allow me to iterate quicker, and only optimize when it's needed.

I find at the start of a project I keep having to sync up my tables and views as they change.
Since the thing you're trying to do is not really supported by standard SQL, and you keep modifying database structures in development, I wonder if your best approach would be to write a little script that recreates that SELECT statement for you. Maybe wrap it in a method call in the development language of your choice?
Essentially you'd need to query INFORMATION_SCHEMA for the tables and columns of interest, probably via a join, and write the results out in SQL style.
Then just run the script every time you make database structural changes that are important to you, and watch your code magically keep up.

Best way to do a query with a large number of possible joins

On the project I'm working on we have an activity table and each activity can be linked to one of about 20 different "activity details" tables...
e.g. If the activity was of type "work", then it would have a corresponding activity_details_work record, if it was of type "sick leave" then it would have a corresponding activity_details_sickleave record and so on.
Currently we are loading the activities and then for each activity we have a separate query to go fetch the activity details from the relevant table. This obviously doesn't scale well if you have thousands of activities.
So my initial thought was to have a single query which fetches the activities and joins the details in one go e.g.
SELECT * FROM activity
LEFT JOIN activity_details_1_work ON ...
LEFT JOIN activity_details_2_sickleave ON ...
LEFT JOIN activity_details_3_travelwork ON ...
...etc...
LEFT JOIN activity_details_20_yearleave ON ...
But this will result in each record having 100's of fields, most of which are empty and that feels nasty.
Lazy-loading the details isn't really an option either as the details are almost always requested in the core logic, at least for the main types anyway.
Is there a super clever way of doing this that I'm not thinking of?
Thanks in advance

My suggestion is to define a view for each ActivityType, that is tailored specifically to that activity.
Then add an index on the Activity table lead by the ActivityType field. Cluster said index unless there is an overwhelming need for some other to be clustered (or performance benchmarking shows some other clustering selection to be more performant).
Is there a particular reason why this degree of denormalization was designed in? Is that reason well known?

Chances are your activity tables are like (date_from, date_to, with_who, descr) or something to that effect. As Pieter suggested, consider tossing in a type varchar or enum field in there, so as to deal with a single details table.
If there are rational reasons to keep the tables apart, consider adding triggers that maintain boolean/tinyint fields (has_work, has_sickleave, etc), or a bit string (has_activites_of_type where the first position amounts to has_work, the next to has_sickleave, etc.).
Either way, you'll probably be better off by fetching the activity's details in one or more separate queries -- if only to avoid field name collisions.

I don't think enum is the way to go, because as you say there might be 1000's of activities, then altering your activity table would become an issue.
There is no point doing a left join on a large number of tables either.
So the options that you have are :
See this The first comment might be useful.
I am guessing that your activity table has a field called activity_type_id.
Build a table called activity_types containing fields activity_type_id, activity_name, activity_details_table_name. First query in the following way
activity
inner join
activity_types
using( activity_type_id )
This query gives you the table name on which to query for the details.
This way you can add any new activity type just by adding a row in the activity_types table.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008