Column prefixes in tables - mysql

First, I should point out that I have read "Database columns type prefix", but it's not the same issue.
A long time ago, someone I worked with told me that in the project I took part in, all columns needed to have a unique prefix.
For example, for the users table I use the prefix u_, so all columns are named u_id, u_name and so on. The same goes for all other tables; for example, products gets the p_ prefix.
The reason given was easier SQL JOINs: all columns have unique names when 2 or more tables are joined. To be honest, I've followed this suggestion so far, but I don't actually know whether many of you use it or whether it's really that useful.
What's your opinion on that? Do you use such column naming, or do you think it is completely unnecessary and a waste of time? (When displaying data you need to use the prefixes, unless you remove them using a function or a foreach loop.)
EDIT
Just in case, some more explanation:
Assume we have a users table with fields id, name and an address table with fields id, name, user_id.
If this method is used and we want to get all fields, we can do:
SELECT *
FROM users u
LEFT JOIN address a on u.u_id = a.a_user_id
And if we don't use prefixes for columns, we have to use:
SELECT u.id AS `u_id`,
u.name AS `u_name`,
a.id AS `a_id`,
a.name AS `a_name`,
a.user_id
FROM users u
LEFT JOIN address a on u.id = a.user_id
assuming of course we want to access columns by name and not by numeric indexes 0, 1 and so on (for example in PHP).
EDIT2
It seems that I haven't explained the problem well enough. In MySQL itself, of course, everything works just fine in both cases, and that's not the issue.
The problem appears when I want to use the data in a programming language, for example PHP. If I use:
<?php
$db = new mysqli('localhost','root','','test_prefix');
$result = $db->query("SELECT * FROM `user` u LEFT JOIN `address` a ON u.id = a.user_id ");
while ($data = $result->fetch_array()) {
var_dump($data);
}
I get:
array(8) { [0]=> string(1) "1" ["id"]=> string(1) "1" [1]=> string(5)
"Frank" ["name"]=> string(3) "USA" [2]=> string(1) "1" [3]=> string(3)
"USA" [4]=> string(1) "1" ["user_id"]=> string(1) "1" } array(8) {
[0]=> string(1) "2" ["id"]=> string(1) "2" [1]=> string(4) "John"
["name"]=> string(6) "Canada" [2]=> string(1) "2" [3]=> string(6)
"Canada" [4]=> string(1) "2" ["user_id"]=> string(1) "2" }
Whereas phpMyAdmin displays all columns from both tables for that query.
In PHP I get all the data, but I can only access all of it using numerical indexes: $data[0], $data[1], and that's not very convenient. I cannot get the user name by key, because $data['name'] holds only the address name, and the same goes for id. If I used either aliases for columns or prefixes for columns, I would be able to use string indexes, for example $data['user_name'] to access the user name and $data['address_name'] to access the address name.

I believe this is stupid. You actually end up prefixing all columns in all queries with their table names (or "identifiers"), even where there is no ambiguity.
If you compare:
SELECT t1_col1, t2_col1
FROM t1, t2;
... with:
SELECT t1.col1, t2.col1
FROM t1, t2;
... then the recommendation may appear sensible.
Now compare:
SELECT t3_col3, t4_col4 FROM t3, t4;
... with:
SELECT col3, col4
FROM t3, t4; -- assuming col3 exists only in t3, and col4 only in t4
Now where is the benefit?
One can still argue that a one-or-two letter prefix is still preferable to a long table name:
SELECT t1_col1, t2_col1
FROM very_long_table_name1, very_long_table_name2;
But why bother with a prefix when you can do:
SELECT t1.col1, t2.col1
FROM very_long_table_name1 AS t1, very_long_table_name2 AS t2;
Actually, there could be cases where the prefix might come in handy (handy does not mean recommended in my mind). For example, some drivers (I'm thinking old PHP) may get confused when multiple columns of a result set have the same name (because they return rows as an array indexed by column name). The problem could still be worked around by aliasing the columns in the result set.
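To make the driver issue and the aliasing workaround concrete, here is a minimal sketch. It uses Python with an in-memory SQLite database purely as a runnable stand-in for PHP and MySQL; the helper `fetch_dicts` and the alias names are made up for illustration. The idea is the same as with `fetch_array()`: once rows are keyed by column name, duplicate names overwrite each other unless they are aliased.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address (id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1, 'Frank'), (2, 'John');
    INSERT INTO address VALUES (1, 'USA', 1), (2, 'Canada', 2);
""")

def fetch_dicts(sql):
    """Hypothetical helper: run a query and return rows as dicts keyed by column name."""
    cur = conn.execute(sql)
    names = [col[0] for col in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]

# SELECT *: both tables expose "id" and "name", so later columns overwrite earlier ones.
clobbered = fetch_dicts(
    "SELECT * FROM users u LEFT JOIN address a ON u.id = a.user_id ORDER BY u.id"
)

# Aliased columns: every key is unique, and no schema-wide prefix is needed.
aliased = fetch_dicts("""
    SELECT u.id AS user_id, u.name AS user_name,
           a.id AS address_id, a.name AS address_name
    FROM users u
    LEFT JOIN address a ON u.id = a.user_id
    ORDER BY u.id
""")

print(clobbered[0])  # the user's name is lost under the "name" key
print(aliased[0])    # both names are reachable by key
```

The aliases live in the one query that needs them, so the tables themselves can keep plain column names.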

I think it's unnecessary, BUT in the name of consistency on an existing project you should either maintain it or refactor the whole database.
Why is it unnecessary? Take a look at the following query, which illustrates how you can get whatever you want out of the database.
I also think it's more readable, and the aliasing works fine.
In cases where your column names collide, which doesn't work well with some drivers, you can use AS to alias that specific field. You can also JOIN the same table twice, which gives you exactly the same problem even when you use prefixes.
SELECT
`m`.*,
`u1`.`username` AS `sender_name`,
`u2`.`username` AS `receiver_name`
FROM `messages` `m`
INNER JOIN `users` `u1` ON `m`.`sender` = `u1`.`id`
INNER JOIN `users` `u2` ON `m`.`receiver` = `u2`.`id`
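A small runnable sketch of the same double join, again using Python with in-memory SQLite as a stand-in for MySQL. The `body` column, the sample rows and the alias names `sender_name` / `receiver_name` are assumptions for illustration; the aliases are chosen so they cannot collide with the `sender` and `receiver` id columns of `messages`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access by column name
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY, sender INTEGER, receiver INTEGER, body TEXT
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO messages VALUES (10, 1, 2, 'hi');
""")

# The users table is joined twice; only the table aliases u1/u2 tell the
# two copies apart, so column prefixes would not have helped here.
rows = conn.execute("""
    SELECT m.id, m.body,
           u1.username AS sender_name,
           u2.username AS receiver_name
    FROM messages m
    INNER JOIN users u1 ON m.sender = u1.id
    INNER JOIN users u2 ON m.receiver = u2.id
""").fetchall()

for row in rows:
    print(row["id"], row["sender_name"], "->", row["receiver_name"])
```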

I'd go so far as to say it's a dangerous practice as it encourages sloppy coding.
Suppose you have 2 tables: User and Usage
select u_type, us_name
from User
inner join Usage on u_id = us_id
Which field is coming from where? You'd need to go look at the table structures to determine it and it can be tempting to make assumptions in these cases.
select u.type,us.name
from User us
inner join Usage u on us.id = u.id
Now you have all the information you need right in front of you.

Related

How should I index my tables and build my request to improve performance?

I'm using MySQL. For a private-messaging (MP) system, I have two tables (plus one to list the conversations):
_ msg_individus (= members of a conversation)
mi_mcid : id of the conversation
mi_uid : id of the user
mi_ustatus : status of the conversation for the user (opened or closed)
mi_datelecture : the last time (timestamp) the user opened the conversation
For now I have indexed mi_mcid and mi_uid together as the primary key.
_ msg_messages (= messages of the conversation)
msg_id : id of the message
msg_uid : id of the user who wrote the message
msg_mcid : id of the conversation
msg_text : content of the message
msg_timestamp : when the message was posted
For now I indexed msg_id as primary key and msg_mcid as an index.
Here's the thing: I want to know whether there is a message the user has not read. For that, I compare the last msg_timestamp with mi_datelecture; if the first one is bigger than the second one, then there's something new.
But for some reason the performance of this request is very bad, and I can't figure out how to index properly or how to build the request so that it performs better.
This is what I built :
SELECT 1 FROM msg_messages as msg
WHERE msg.msg_uid != :u_id
AND msg.msg_status = "1"
AND msg.msg_mcid IN (SELECT mi.mi_mcid
FROM msg_individus as mi
WHERE mi.mi_uid = :uid
AND mi.mi_ustatus = "2"
AND mi.mi_datelecture < msg.msg_timestamp)
LIMIT 0,1
I tried adding indexes on msg_status, mi_uid and mi_ustatus, for example, but even though things are a little better, performance is still poor. When I don't compare mi_datelecture and msg_timestamp, the query takes about 0.05 sec, while it takes 0.20 sec when I do.
Thank you for your advice.
(from Comment) New attempt:
SELECT 1
FROM msg_messages as msg
WHERE msg.msg_uid != :u_id
AND msg.msg_status = "1"
AND EXISTS
(
SELECT *
FROM msg_individus as mi
WHERE mi.mi_mcid = msg.msg_mcid
AND mi.mi_uid = :uid
AND mi.mi_ustatus = "2"
AND (mi.mi_datelecture = "0"
OR mi.mi_datelecture < msg.msg_timestamp)
)
LIMIT 0,1
Create this index.
CREATE INDEX idx_mi_covering ON msg_individus  -- MySQL requires an index name; this one is arbitrary
(mi_uid, mi_ustatus, mi_datelecture, mi_mcid );
It is a covering index suitable for your subquery. The subquery can be satisfied completely from the index.
If you need more help read this then ask another question.
If all you need is "existence", use EXISTS ( SELECT 1 ... ) instead of LIMIT 1.
Change IN ( SELECT ... ) into either a JOIN or EXISTS ( SELECT 1 ... ); either is likely to be faster.
And see my Comment that ponders whether the query even provides the desired info.
Then we can, and should, discuss indexes.
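As a rough sketch of the EXISTS advice, the check "does this user have anything unread?" can be written as a single `SELECT EXISTS (...)` over a join. The example below uses Python with in-memory SQLite as a stand-in for MySQL. The function name `has_unread`, the integer status values (2 = open conversation, 1 = visible message, guessed from the question's query) and the sample rows are all assumptions, and the query has not been benchmarked against the real tables.

```python
import sqlite3

def has_unread(conn, uid):
    """Return True if an open conversation of `uid` holds a newer message from someone else."""
    row = conn.execute("""
        SELECT EXISTS (
            SELECT 1
            FROM msg_individus AS mi
            JOIN msg_messages AS msg ON msg.msg_mcid = mi.mi_mcid
            WHERE mi.mi_uid = :uid
              AND mi.mi_ustatus = 2                        -- assumed: 2 = open
              AND msg.msg_uid != :uid                      -- written by someone else
              AND msg.msg_status = 1                       -- assumed: 1 = visible
              AND msg.msg_timestamp > mi.mi_datelecture    -- newer than last read
        )
    """, {"uid": uid}).fetchone()
    return bool(row[0])

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE msg_individus (
        mi_mcid INTEGER, mi_uid INTEGER, mi_ustatus INTEGER, mi_datelecture INTEGER,
        PRIMARY KEY (mi_mcid, mi_uid)
    );
    CREATE TABLE msg_messages (
        msg_id INTEGER PRIMARY KEY, msg_uid INTEGER, msg_mcid INTEGER,
        msg_text TEXT, msg_timestamp INTEGER, msg_status INTEGER
    );
    -- user 1 last read at t=100, user 2 at t=200
    INSERT INTO msg_individus VALUES (1, 1, 2, 100), (1, 2, 2, 200);
    INSERT INTO msg_messages VALUES
        (1, 1, 1, 'hello', 120, 1),
        (2, 2, 1, 'hi back', 150, 1);
""")

print(has_unread(conn, 1), has_unread(conn, 2))
```

Because the outer query is a bare EXISTS, the database can stop at the first matching row and no LIMIT is needed.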

How to change this SQL query to "SELECT DISTINCT" based on one column?

So I know that using SQL you can't do a DISTINCT based on one column, but looking at other answers it looks like it's possible to do this using a sub query. Could anyone help me do this with my SQL query? I've tried a bunch of different ways and can't seem to figure out how to make it work.
I'm trying to SELECT DISTINCT based on the emplid column. For example, if there are 2 rows with the same emplid, I only want one of them.
SELECT DISTINCT t.TenantID, rt.term, t.emplid, t.staff, loc.locationName, tt.comment, l.lengthName
FROM TenantTerm tt
INNER JOIN Tenant t ON t.TenantID = tt.TenantID
INNER JOIN RentTerm rt ON rt.TenantID = t.TenantID
INNER JOIN length l ON l.lengthID = tt.lengthID
INNER JOIN location loc ON loc.locationID = tt.locationID
WHERE tt.assigned='0' AND rt.term>='$currentTerm' ORDER BY t.TenantID
And some sample output data:
["TenantID"] => string(3) "535"
["term"]=> string(4) "2137"
["emplid"]=> string(7) "1855280"
["staff"]=> string(1) "0"
["locationName"]=> string(12) "BuildingOne"
["comment"]=> string(0) ""
["lengthName"]=> string(13) "Academic Year"
Note: Edited to increase clarity and to include sample data.
What you are asking doesn't really make sense: DISTINCT applies to all columns in the SELECT. So if you want just unique emplid values returned, what do you expect in the other columns?
If these are consistent already you wouldn't be asking this question, and if you don't care about them, why are you returning them?
The simple answer is:
SELECT DISTINCT t.emplid
FROM TenantTerm tt
JOIN Tenant t
ON t.TenantID = tt.TenantID
JOIN RentTerm rt
ON rt.TenantID = t.TenantID
JOIN length l
ON l.lengthID = tt.lengthID
JOIN location loc
ON loc.locationID = tt.locationID
WHERE tt.assigned='0'
AND rt.term>='$currentTerm'
ORDER BY t.emplid
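If the other columns are needed after all, one has to decide which row per emplid to keep. The sketch below assumes, purely for illustration, that the row with the lowest TenantID wins, and it reduces the schema to a single Tenant table with made-up rows. It runs on in-memory SQLite from Python, but the correlated-subquery pattern itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tenant (TenantID INTEGER PRIMARY KEY, emplid TEXT, staff INTEGER);
    -- emplid 1855280 appears twice; only one of its rows should come back
    INSERT INTO Tenant VALUES
        (535, '1855280', 0),
        (536, '1855280', 0),
        (540, '2000001', 1);
""")

# Keep, for every emplid, only the row whose TenantID is the smallest
# among the rows sharing that emplid (the tie-break rule is an assumption).
rows = conn.execute("""
    SELECT t.TenantID, t.emplid, t.staff
    FROM Tenant t
    WHERE t.TenantID = (
        SELECT MIN(t2.TenantID)
        FROM Tenant t2
        WHERE t2.emplid = t.emplid
    )
    ORDER BY t.TenantID
""").fetchall()

print(rows)
```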

Is there a way to identify those records not found within a where IN() statement?

From PHP Code $Lines is defined as a list of accessions e.g. 123,146,165,1546,455,155
plant table has sequential records with the highest idPlant (unique identifier) of say 1000.
My simple SQL Query:
SELECT * FROM plant WHERE `plant`.idPlant IN($Lines) order by plant.idPlant;
This brings back row data for '123,146,165' etc.
Is there a way to be told that '1546' was not found? (That would mean the user probably entered a typo. I cannot use a 'confirm all numbers are below X' check, because in the real data idPlant may not be sequential and the upper bound will increase during use.)
Update:
Looking to get an output that will tell me what Numbers were not found.
You can build up a sub query using unions that returns a list of all your values, then LEFT JOIN against that, checking for NULL in the WHERE clause to find the non matching values.
Basic PHP for this would be something like this:
<?php
$sub_array = explode(',', $Lines);
$sub = '(SELECT '.implode(' AS i UNION SELECT ', $sub_array).' AS i) sub0';
$sql = "SELECT sub0.i
FROM $sub
LEFT OUTER JOIN plant
ON plant.idPlant = sub0.i
WHERE plant.idPlant IS NULL";
?>
You can create a temporary table and compare it to the original table. It goes something like this:
CREATE TEMPORARY TABLE IF NOT EXISTS plantIDs (
ID INT(11) NOT NULL UNIQUE,
found INT(11) NOT NULL DEFAULT 0);
INSERT INTO plantIDs(ID) VALUES (123),(146),(165),(1546),(455),(155);
SELECT plantIDs.ID, COALESCE(plant.name, "Not Found") as PlantName, plant.* FROM plant RIGHT JOIN plantIDs ON plant.idPlant=plantIDs.ID ORDER BY plantIDs.ID;
Assuming you have a field named name inside the table plant, this code will produce a row for each requested ID, and the column named PlantName will contain the name of the plant or the text "Not Found". Of course, you can change the COALESCE value to anything that fits your needs.
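Both answers boil down to an anti-join: load the requested IDs somewhere the database can see them, LEFT JOIN against plant, and keep the rows with no match. A minimal sketch with bound parameters instead of string concatenation follows, using Python and in-memory SQLite as a stand-in for PHP and MySQL. The function `find_missing_ids`, the temp table `wanted` and the sample plants are hypothetical, and `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`).

```python
import sqlite3

def find_missing_ids(conn, wanted_ids):
    """Return the requested ids that have no row in plant, in ascending order."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM wanted")
    conn.executemany(
        "INSERT OR IGNORE INTO wanted (id) VALUES (?)",
        [(i,) for i in wanted_ids],
    )
    # Anti-join: rows from wanted that found no partner in plant.
    rows = conn.execute("""
        SELECT w.id
        FROM wanted w
        LEFT JOIN plant p ON p.idPlant = w.id
        WHERE p.idPlant IS NULL
        ORDER BY w.id
    """).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plant (idPlant INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO plant VALUES (?, ?)",
    [(i, f"plant {i}") for i in (123, 146, 155, 165, 455)],
)

lines = "123,146,165,1546,455,155"  # same shape as $Lines in the question
missing = find_missing_ids(conn, [int(x) for x in lines.split(",")])
print(missing)  # [1546]
```

Converting each entry with `int()` before it reaches the query also rejects non-numeric input early.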

Alias a column name on a left join

Let's say I have two tables, and both their primary identifiers use the name 'id'. If I want to perform a join with these two tables, how would I alias the id of the table that I want to join with the former table?
For example:
SELECT * FROM `sites_indexed` LEFT JOIN `individual_data` ON `sites_indexed`.`id` = `individual_data`.`site_id` WHERE `url` LIKE :url
Now, site_id is supposed to link up with sites_indexed.id. The actual id which represents the row for individual_data however has the same title as sites_indexed.
Personally, I like to just use the name id for everything, as it keeps things consistent. When scripting server-side however, it can make things confusing.
e.g.
$var = $result['id'];
Given the aforementioned query, wouldn't this confuse the interpreter?
Anyway, how is this accomplished?
Instead of selecting all fields with "SELECT *" you should explicitly name each field you need, aliasing them with AS as required. For example:
SELECT si.field1 as si_field1,
si.field2 as si_field2,
ind_data.field1 as ind_data_field1
FROM sites_indexed as si
LEFT JOIN individual_data as ind_data
ON si.id = ind_data.site_id
WHERE `url` LIKE :url
And then you can reference the aliased names in your result set.
This thread is old, and I found it because I had the same problem. Now I have a better solution.
The answer given by Paul McNett and antun forces you to list all fields, but in some cases this is impractical (too many fields to list), so you can keep the * and alias only the fields you want (typically the fields that have the same name and would overwrite each other).
Here's how :
SELECT *, t.myfield as myNewName
FROM table t ... continue your query
You can add as many aliases as you want, separated by commas.
Using this expression you will get results with columns id (from table sites_indexed) and id2 (alias for column id from table individual_data)
SELECT t1.*, t2.* FROM sites_indexed t1
LEFT JOIN (select id as id2, site_id, other_field1, other_field2 FROM individual_data) t2 ON t1.id = t2.site_id WHERE your_statement
The problem is that you're using the * wildcard. If you explicitly list the column names in your query, you can give them aliases:
SELECT `sites_indexed`.`id` AS `sites_indexed_id`,
`individual_data`.`id` AS `individual_data_id`
FROM `sites_indexed`
LEFT JOIN `individual_data` ON `sites_indexed`.`id` = `individual_data`.`site_id`
WHERE `url` LIKE :url
Then you can reference them via the alias:
$var = $result['sites_indexed_id'];
$var_b = $result['individual_data_id'];

Custom Order By + Group By

So I have a manual in this table:
id  lang  header_name        text
1   uk    Youth Development  It's very important
2   dk    tst                hejsa
3   uk    tst                hello sir
And I want to make a query that fetches all manual entries for a given language (Danish in this case). If for some reason not all of the original manual entries (the UK ones) have been translated, I want to get the English entry instead. Is that even possible with a table format such as this?
I guess it would be something with a "group by header_name" of some sorts, but not sure.
Try this. I don't have a SQL database at hand, so this is not tested.
The tables t1, t2 and t3 refer to the same table; use aliases to distinguish them:
select * from t3
where t3.lang IN ('DK','UK')
and t3.ID NOT IN
(select t1.id
FROM t1,t2
where t1.header_name = t2.header_name
AND t2.lang = 'DK'
AND t1.lang = 'UK'
)
Essentially, you first need to find the IDs that have a translation, and then exclude them.
This might do the trick but it is not optimized:
SELECT *
FROM the_table
WHERE lang = 'dk'
UNION
SELECT *
FROM the_table
WHERE lang <> 'dk' AND header_name NOT IN (
SELECT header_name
FROM the_table
WHERE lang = 'dk'
)
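To check the UNION idea against the sample rows from the question, here is a small sketch in Python with in-memory SQLite standing in for MySQL. The table name `manual`, the function `manual_for` and its `fallback` parameter are assumptions. Unlike the query above, the second branch is restricted to one fallback language instead of `lang <> 'dk'`, so a third language would not leak into the result.

```python
import sqlite3

def manual_for(conn, lang, fallback="uk"):
    """Entries in `lang`, plus `fallback` entries whose header_name has no `lang` version."""
    return conn.execute("""
        SELECT id, lang, header_name, text
        FROM manual
        WHERE lang = :lang
        UNION ALL
        SELECT id, lang, header_name, text
        FROM manual
        WHERE lang = :fallback
          AND header_name NOT IN (
              SELECT header_name FROM manual WHERE lang = :lang
          )
        ORDER BY id
    """, {"lang": lang, "fallback": fallback}).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE manual (id INTEGER PRIMARY KEY, lang TEXT, header_name TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO manual VALUES (?, ?, ?, ?)",
    [
        (1, "uk", "Youth Development", "It's very important"),
        (2, "dk", "tst", "hejsa"),
        (3, "uk", "tst", "hello sir"),
    ],
)

rows = manual_for(conn, "dk")
for row in rows:
    print(row)
```

With the sample data this yields the untranslated English entry (id 1) and the Danish entry (id 2), while the English "tst" row (id 3) is left out because a Danish version exists. Matching on header_name only works as long as the header itself is not translated.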
I can't comment, but all I want to ask is:
In the example you just put up, are the rows with id 2 and id 3 the same entry, only in a different language?
I'd say you pull it apart and make two tables:
Example
id
sort
(all other generic columns)
example_translations
id
example_id
language_id
header_name
text
Then, if you query for the Danish translation of an example with id 1, it returns the example_translations row for that entity. If it returns nothing, you can query for the English version.
I don't think it is possible to do something like this at the MySQL level.
The way I understand this, you want to get the English content if the Danish content is missing? You might want to add a column to your table that marks which entries belong together. (I don't know if your "header_name" column does that effectively for you; I'm guessing it will be translated as well.)
For example, with a column named "entry_id", "tst dk" and "tst uk" would both have id "2". When you load your manual, you would then ask for the "entry_id", look for the dk entry first and, if it's not there, load the uk entry.