Specific SQL queries with JOIN vs. multiple methods - mysql

I have a NodeJS app and a MySQL database, which I access through the mysql npm package.
Very often my SQL schema has relations between tables like this:
Table User:
SERIAL id
...
Table Group:
SERIAL id
...
Table Group_Constraints:
SERIAL ConstraintId
...
Table UserToGroup:
BIGINT UserId
BIGINT GroupId
Table GroupToConstraint:
BIGINT GroupId
BIGINT ConstraintId
Now, in my User model, I want a function that gives me the users, their groups, and each group's constraints.
For now, I do a big custom SQL request with some JOINs. It works, but it leaves me with many "custom" functions like getUsersWithGroupAndConstraints.
As I have this design in many different places in my code, it becomes very hard to maintain.
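For reference, the kind of query I write today looks roughly like this (aliases are only illustrative; note that Group needs backticks because GROUP is a reserved word in MySQL):
SELECT u.id AS userId, g.id AS groupId, gc.ConstraintId
FROM User u
LEFT JOIN UserToGroup utg ON utg.UserId = u.id
LEFT JOIN `Group` g ON g.id = utg.GroupId
LEFT JOIN GroupToConstraint gtc ON gtc.GroupId = g.id
LEFT JOIN Group_Constraints gc ON gc.ConstraintId = gtc.ConstraintId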
I wish my code were a bit more generic, with a User/Group/Constraint model that I could then query this way:
User.getAll().forEach(user =>
  Group.getAll(user).forEach(group =>
    Constraint.getAll(group)))
But that would go from 1 SQL query to User.length * Group.length SQL queries.
I cannot find a way to achieve a clean design without a HUGE number of SQL queries and therefore, I guess, very poor performance.
How can I do this?

Related

MySQL command: what does the point mean?

I'm a newbie in MySQL and have to write an implementation of a custom MySQL ASP.NET identity storage provider.
I followed this tutorial and the first steps are done:
https://learn.microsoft.com/en-us/aspnet/identity/overview/extensibility/implementing-a-custom-mysql-aspnet-identity-storage-provider
Now I have the following MySQL command:
"Select Roles.Name from UserRoles, Roles where UserRoles.UserId = #userId and UserRoles.RoleId = Roles.Id"
My problem is that I don't know how the tables have to look for this query.
I would say:
Table name: Roles
Select: Roles and Name? Or is it one name?
The same goes for UserRoles.UserId and UserRoles.RoleId.
What does the point mean?
Thanks a lot
Your question is quite unclear; however, if I understood correctly, you can't quite figure out how the database schema you are using is structured and what you'll get from this query.
The query you have written SELECTs the field called Name from the table called Roles. In order to do this, the query uses data coming from two tables: one is the Roles table itself, the other is called UserRoles.
It will extract the Name values from the Roles table only for the Roles entries whose Id field matches the RoleId field of the UserRoles entries that have UserId equal to the given #userId.
In other words, this SELECT query will give you as a result the list of Names coming from the entries in the Roles table which match the given conditional check, i.e. what is written after the SQL where keyword: where UserRoles.UserId = #userId and UserRoles.RoleId = Roles.Id.
Finally, the point "." in SQL queries is used to disambiguate between fields (or columns, if you want to call them so) with the same name but coming from different tables. It is quite common for all tables to have an Id field, for example. You can identify the correct Id field in your database by writing Table1.Id, Table2.Id, and so on. Even if you don't have naming conflicts between your table columns, adding the table name can be very good for readability.
Edit:
As other users correctly pointed out in the comments on your question, you should also have a look at what an SQL JOIN operation is. Since you are searching for data using information coming from different tables, you are actually doing an implicit JOIN on those tables.
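For illustration, the same query could be rewritten with an explicit JOIN; the result should be identical (#userId remains the parameter placeholder from the tutorial):
SELECT Roles.Name
FROM UserRoles
INNER JOIN Roles ON Roles.Id = UserRoles.RoleId
WHERE UserRoles.UserId = #userId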

MySQL query - join or loop query

So I am working with a few tables in a database, and I'm wondering about the best way to query them.
Here's the setup:
EVENTS:
int event_id
varchar event_name
date event_date
ATTENDANCE:
int attendance_id
int event_id (foreign key for EVENTS)
int user_id (foreign key for USERS)
int status
USERS:
int user_id
varchar first_name
varchar last_name
varchar email
Pretty much what I was going to do is: take an event (its ID #) that I want the attendance for, query the attendance table for all records matching that event, then query the users table for all users referenced in the attendance records for that event.
The first thought that came to mind was to first query the database for all attendance entries to get an array, then loop through each record to query the user information. However, this seems pretty inefficient, and there must be a better way with joins or something of the like. I don't have much experience with joins, so I was wondering if I could get some help.
This is the pseudocode of what I was originally thinking:
SELECT * FROM attendance WHERE event_id = eventID
while (row exists):
    SELECT * FROM users WHERE user_id = attendanceUserID
    get info, export in XML, etc.
I don't think this is the best way to do this, so what would be the better way to do it?
The question is "join or loop?" and the technical answer is: use a join. What you are describing is exactly what joins are meant to do: combine tables on conditions.
"select from ... (select more)" isn't the way to go about it. Consider instead the question "what do these two tables have that connects them in a totally reliable and identifiable way?" The note in the above comment is spot on.
However, as an "old man", I'd say the question isn't quite that straightforward. Everything is a question of time, yours and the machine's. So ask yourself this: imagine method A is 100 times more efficient than method B. But you already know how to do method B. AND... the difference is 0.02 milliseconds versus 2 milliseconds. (Let's just say you have a small data set and a fast machine.) If you can code up method B in three minutes and get on with your project, that might be good, especially if there's a deadline. Everything is easier with a working example to start from, even if it's implemented in a different way: it gives you something to test the NEW method you're just learning against. Lots of people chase efficiency before they even know whether it matters.
Get things working first; make them faster second. (Of course, don't paint yourself into a horrible design corner. You haven't, though: the database is fine, you're just looking at different ways of getting info out of it.)

SQL to update column in modified table

I am a reasonably competent SQL programmer, but my skills are still pretty much in the domain of simple INSERT, SELECT, and UPDATE statements, with an occasional LIKE etc. thrown in. What I am currently trying to do is rather more complex. Here is the scenario.
I have three tables.
Table 1, *users*, identifies users via a user ID, uid. Users can have one or more subaccounts.
Table 2, *accounts*, keeps a record of subaccounts for each user with, amongst other things, the columns uid and sid, where uid is the one defined in the *users* table.
Table 3, *data*, currently stores some data in a data column that is associated with a particular subaccount, sid.
The thing I have just realized is that there is no particular reason to block users from using those data across subaccounts. No problem - I can change my data-subset search SQL to work with the uid instead. However, given the frequency of such searches, it seems well worthwhile to simply stick a uid column into *data*.
To do that I would need to write some smart SQL that would get uid, sid pairs from the *accounts* table and use that information to update the newly created uid column in the *data* table. This, I have to admit, is beyond my knowledge of SQL.
I should mention that the system using these data is now in production and has several hundred users, so the option of just acting like they are not there is not available. Not terribly relevant, I think, but I should also mention that uid and sid are alphanumeric strings, with both columns being indexed.
I would be most grateful to anyone here who might be able to help out with it.
MySQL can do updates based on joins, and based on my reading of your schema, here's what I'd do:
UPDATE accounts a, data d
SET d.uid = a.uid
WHERE a.sid = d.sid
AND d.uid IS NULL
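If the uid column doesn't exist in *data* yet, you would add it (and an index, since you index it elsewhere) first; note that the VARCHAR length here is an assumption, so match it to the uid definition in *accounts*:
ALTER TABLE data ADD COLUMN uid VARCHAR(32) NULL;
CREATE INDEX idx_data_uid ON data (uid);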

DB Design and Data Retrieval from a heavy table

I have a requirement to have 612 columns in my database table. The number of columns per data type is:
BigInt - 150 (PositionCol1, PositionCol2, ..., PositionCol150)
Int - 5
SmallInt - 5
Date - 150 (SourceDateCol1, SourceDateCol2, ..., SourceDateCol150)
DateTime - 2
Varchar(2000) - 150 (FormulaCol1, FormulaCol2, ..., FormulaCol150)
Bit - 150 (IsActiveCol1, IsActiveCol2, ..., IsActiveCol150)
When the user does the import for the first time, the data gets stored in PositionCol1, SourceDateCol1, FormulaCol1, IsActiveCol1, etc. (plus the other DateTime, Int, and SmallInt columns).
When the user does the import for the second time, the data gets stored in PositionCol2, SourceDateCol2, FormulaCol2, IsActiveCol2, etc., and so on.
There is a ProjectID column in the table for which data is being imported.
Before starting the import process, the user maps the Excel column names to the database column names (PositionCol1, SourceDateCol1, FormulaCol1, IsActiveCol1), and this mapping gets stored in a separate table, so that the retrieved data can be shown under the mapped column names instead of the DB column names. E.g.
PositionCol1 may be mapped to SAPDATA
SourceDateCol1 may be mapped to SAPDATE
FormulaCol1 may be mapped to SAPFORMULA
IsActiveCol1 may be mapped to SAPISACTIVE
40,000 rows will be added to this table every day. My question is: will the database be able to handle that load in the long run?
Most of the time, a row will have data in about 200-300 columns; in the worst case it'll have data in all 612 columns. Keeping this in view, should I make some changes in the design to avoid future performance issues? If so, please suggest what could be done.
If I stick with my current design, what points should I take care of, apart from indexing, to get optimal performance while retrieving data from this huge table?
If I need to retrieve the data of a particular entity, e.g. SAPDATA, I'll have to go to my mapping table, get the database column name mapped to SAPDATA (PositionCol1 in this case), and retrieve it. But that way I'll have to write dynamic queries. Is there any better way?
Don't stick with your current design. Your repeating groups are unwieldy and self-limiting... What happens when somebody uploads 151 times? Normalise this table so that you have one column of each type per row rather than 150. You won't need the mapping table this way, as you can select SAPDATA from the position column without worrying about whether it is number 1 through 150.
You probably want a PROJECTS table with an ID, and a PROJECT_UPLOADS table with an ID and an FK to the PROJECTS table. The latter would have Position, SourceDate, Formula and IsActive columns, given your use-case above.
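A minimal sketch of that normalized structure, assuming MySQL syntax (names follow the ones above; types are taken from your current column list):
CREATE TABLE PROJECTS (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);
CREATE TABLE PROJECT_UPLOADS (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    projectid BIGINT NOT NULL,
    position BIGINT,       -- replaces PositionCol1..150
    sourcedate DATE,       -- replaces SourceDateCol1..150
    formula VARCHAR(2000), -- replaces FormulaCol1..150
    isactive BIT,          -- replaces IsActiveCol1..150
    FOREIGN KEY (projectid) REFERENCES PROJECTS(id)
);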
Then you could do things like
select p.name, pu.position from PROJECTS p inner join PROJECT_UPLOADS pu on pu.projectid = p.id WHERE pu.position = 'SAPDATA'
etc.

Joining a table stored within a column of the results

I want to try to keep this as one query and not fall back to PHP, but it's proving to be tough.
I have a table called applications, that stores all the applications and some basic information about them.
Then, I have a table with all the types of applications in it; that table contains a reference to yet another table which stores more specific data about the particular type of application in question.
select applications.id as appid, applications.category, type.title as type, type.id as tid, type.valuefld, type.tablename
from applications
left join type on applications.typeid=type.id
left join department on type.deptid=department.id
where not isnull(work_cat)
and work_cat != ''
and applications.deleted=0
and datei between '10-04-14' and '11-04-14'
order by type, work_cat
Now, in the old version, there is another query run on every single result. Over hundreds of results... that sucks.
This is the query I'd like to integrate so I can get all the data in one result row. (The old version is ASP; I'm rewriting it in PHP.)
query = "select sum("&adors.fields("valuefld")&") as cost, description from "&adors.fields("tablename")&" where appid = '"&adors.fields("tablename")&"'"
Prepared statements, I'm aware, are the best solution, but for now they are not an option.
You can't do this with a plain SQL query - you need a defined set of tables for your query to be based on. The fact that your current implementation queries whatever table is named by tablename in the first result set means that, to get this all in one query, you will have to restructure your data. You have to know which tables you're querying rather than having them be dynamic.
If the reason for these different tables is that the different information stored in each requires a different record (column) structure, you might want to look into key/value pair storage in one large table. Once you combine the dynamically named tables into a single location, you can integrate your two queries together.
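A minimal sketch of that key/value idea, with hypothetical names (appid, value and description mirror the fields used in your second query):
CREATE TABLE application_details (
    appid INT NOT NULL,         -- references applications.id
    name VARCHAR(64) NOT NULL,  -- which attribute this row holds, e.g. 'cost'
    value VARCHAR(2000),
    description VARCHAR(255),
    PRIMARY KEY (appid, name)
);
With all the type-specific data in one known table, the per-row lookup collapses into one extra LEFT JOIN in your main query instead of a separate query for every result.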