Multiple data stored into a field for better management? - mysql

I have a MySQL table as follows:
id, user, culprit, reason, status, ts_register, ts_update
I was thinking of making reason an INT field that stores just the id of the reason selected by the user, while the list of available reasons could be extended by the admin.
By extended I mean that the admin could register new reasons. For example, currently we have:
Flood, Racism, Hacks, Other
But the admin could add a new reason for instance:
Refund
Now my problem is that I would like to allow my users to select multiple reasons, for example:
Report 01 has the reasons Flood and Hacks.
How should I store the reason field so that I could select multiple reasons while maintaining a good table format?
Should I just go ahead and store it as a string and cast it to an INT when I am searching through it, or are there better ways to store it?
UPDATE Based on Jonathan's reply:
SELECT mt.*, GROUP_CONCAT(r.reason SEPARATOR ', ') AS reason
FROM MainTable AS mt
JOIN MainReasons AS mr ON mt.ID = mr.MainTable_ID
JOIN Reasons AS r ON mr.Reason = r.ID
GROUP BY mt.ID

The normalized solution is to have a second table containing one row for each reason:
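CREATE TABLE MainReasons
(
    MainTable_ID INTEGER NOT NULL,
    Reason       INTEGER NOT NULL,
    PRIMARY KEY (MainTable_ID, Reason),
    -- Many MySQL versions parse but ignore inline column-level REFERENCES,
    -- so the foreign keys are declared as separate constraints here.
    FOREIGN KEY (MainTable_ID) REFERENCES MainTable(ID),
    FOREIGN KEY (Reason) REFERENCES Reasons(ID)
);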
(Assuming your main table is called MainTable and you have a table defining valid reason codes called Reasons.)
From a comment:
[W]ould you be [so] kind [as] to show me an example of selecting something to retrieve a report's reason? I mean, if I simply run SELECT * FROM MainTable I would never get any reasons, since MainTable doesn't know about them, right? It is only linked to the MainReasons and Reasons tables, so I would need to do something like SELECT * FROM MainTable LEFT JOIN MainReasons USING (MainTable_ID) or something similar, but how would I go about getting all the reasons if there are multiple?
SELECT mt.*, r.reason
FROM MainTable AS mt
JOIN MainReasons AS mr ON mt.ID = mr.MainTable_ID
JOIN Reasons AS r ON mr.Reason = r.ID
This will return one row per reason - so it would return multiple rows for a single report (recorded in what I called MainTable). I omitted the reason ID number from the results - you can include it if you wish.
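As a rough illustration of the "one row per reason" behaviour, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names follow the CREATE TABLE statement above; the culprit values, the reason column on Reasons, and the sample rows are assumptions made up for the demo. It also shows collapsing the rows with group_concat, as in the question's update.

```python
import sqlite3

# SQLite stands in for MySQL; names follow the CREATE TABLE above,
# the sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MainTable (ID INTEGER PRIMARY KEY, culprit TEXT);
CREATE TABLE Reasons (ID INTEGER PRIMARY KEY, reason TEXT NOT NULL);
CREATE TABLE MainReasons (
    MainTable_ID INTEGER NOT NULL REFERENCES MainTable(ID),
    Reason       INTEGER NOT NULL REFERENCES Reasons(ID),
    PRIMARY KEY (MainTable_ID, Reason)
);
INSERT INTO MainTable VALUES (1, 'alice'), (2, 'bob');
INSERT INTO Reasons VALUES (1, 'Flood'), (2, 'Racism'), (3, 'Hacks'), (4, 'Other');
INSERT INTO MainReasons VALUES (1, 1), (1, 3), (2, 4);
""")

# One result row per (report, reason) pair.
rows = conn.execute("""
    SELECT mt.ID, r.reason
    FROM MainTable AS mt
    JOIN MainReasons AS mr ON mt.ID = mr.MainTable_ID
    JOIN Reasons AS r ON mr.Reason = r.ID
    ORDER BY mt.ID, r.ID
""").fetchall()
print(rows)

# One result row per report; the order inside the concatenated
# string is not guaranteed, so sort after splitting if it matters.
grouped = conn.execute("""
    SELECT mt.ID, group_concat(r.reason, ', ') AS reasons
    FROM MainTable AS mt
    JOIN MainReasons AS mr ON mt.ID = mr.MainTable_ID
    JOIN Reasons AS r ON mr.Reason = r.ID
    GROUP BY mt.ID
    ORDER BY mt.ID
""").fetchall()
print(grouped)
```

Note that SQLite spells the separator as a second argument, whereas MySQL uses GROUP_CONCAT(... SEPARATOR ', ').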
You can add criteria to the query, adding terms to a WHERE clause. If you want to see the reports where a specific reason is specified:
SELECT mt.*
FROM MainTable AS mt
JOIN MainReasons AS mr ON mt.ID = mr.MainTable_ID
JOIN Reasons AS r ON mr.Reason = r.ID
WHERE r.reason = 'Flood'
(You don't need the reason in the results - you know what it is.)
If you want to see the reports where both 'Flood' and 'Hacks' were the reasons given, then you can write:
SELECT mt.*
FROM MainTable AS mt
JOIN (SELECT mr1.MainTable_ID
      FROM MainReasons AS mr1
      JOIN Reasons AS r1 ON mr1.Reason = r1.ID
      WHERE r1.reason = 'Flood'
     ) AS f ON f.MainTable_ID = mt.ID
JOIN (SELECT mr2.MainTable_ID
      FROM MainReasons AS mr2
      JOIN Reasons AS r2 ON mr2.Reason = r2.ID
      WHERE r2.reason = 'Hacks'
     ) AS h ON h.MainTable_ID = mt.ID
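To check the "both reasons" logic on a small data set, here is a sketch that again uses Python's sqlite3 module as a stand-in for MySQL. The schema mirrors the MainTable / MainReasons / Reasons layout from this answer; the sample reports are invented. Each derived table yields the reports that have one of the reasons, and joining both keeps only the reports present in each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MainTable (ID INTEGER PRIMARY KEY, culprit TEXT);
CREATE TABLE Reasons (ID INTEGER PRIMARY KEY, reason TEXT NOT NULL);
CREATE TABLE MainReasons (
    MainTable_ID INTEGER NOT NULL REFERENCES MainTable(ID),
    Reason       INTEGER NOT NULL REFERENCES Reasons(ID),
    PRIMARY KEY (MainTable_ID, Reason)
);
INSERT INTO MainTable VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO Reasons VALUES (1, 'Flood'), (2, 'Racism'), (3, 'Hacks'), (4, 'Other');
-- Report 1: Flood + Hacks, report 2: Flood only, report 3: Hacks + Other.
INSERT INTO MainReasons VALUES (1, 1), (1, 3), (2, 1), (3, 3), (3, 4);
""")

both = conn.execute("""
    SELECT mt.ID
    FROM MainTable AS mt
    JOIN (SELECT mr1.MainTable_ID
          FROM MainReasons AS mr1
          JOIN Reasons AS r1 ON mr1.Reason = r1.ID
          WHERE r1.reason = 'Flood'
         ) AS f ON f.MainTable_ID = mt.ID
    JOIN (SELECT mr2.MainTable_ID
          FROM MainReasons AS mr2
          JOIN Reasons AS r2 ON mr2.Reason = r2.ID
          WHERE r2.reason = 'Hacks'
         ) AS h ON h.MainTable_ID = mt.ID
    ORDER BY mt.ID
""").fetchall()
print(both)
```

Only report 1 carries both reasons in this sample data, so it is the only row that survives both joins.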

To do a one-to-many relationship, I would spin reason off into its own table, like so:
id, parent_id, reason
parent_id would refer back into your current table's id.
You could store the reasons in a single field, but it would be a continual pain to have to parse it every time you wanted to read the data. This way just takes one more join.

Related

MYSQL - SubSelect when FK does and doesnt exists

Situation Overview
This question is about selecting values from two tables: table A (Material) and table B (MaterialRevision). However, the PK of table A might or might not exist in table B. When it doesn't exist, the query described in this question won't return the values of table A, but it should. So basically here's what's happening:
The query only returns values when A.id exists in B.id, when in fact I need it to also return values from A when A.id doesn't exist in B.id.
Problem:
Suppose two tables. Table Material and Table Material Revision.
Notice that the PK idMaterial is a FK in MaterialRevision.
Current "Mock" Tables
Query Objective
Obs: remember these two tables are a simplification of the real tables.
For each Material, print the material variables and the last (MAX) RevisionDate from MaterialRevision. In case there's no RevisionDate, print BLANK ("") for the "last revision date".
What is wrongly happening
For each Material, the query prints the material variables and the last (MAX) RevisionDate from MaterialRevision. In case there's no Revision for the Material, it doesn't print the Material (SKIP).
Current Code
SELECT
Material.idMaterial,
Material.nextRevisionDate,
Material.obsolete,
lastRevisionDate
FROM Material,
(SELECT MaterialRevision.idMaterial, max(MaterialRevision.revisionDate) as "revisionDate" from MaterialRevision
GROUP BY MaterialRevision.idMaterial
) AS Revision
WHERE (Material.idMaterial = Revision.idMaterial AND Material.obsolete = 0)
References and Links used to reach the state described in this question
Why is MAX() 100 times slower than ORDER BY ... LIMIT 1?
MySQL get last date records from multiple
MySQL - How to SELECT based on value of another SELECT
MySQL Query Select where id does not exist in the JOIN table
PS: I hope this question is correctly understood, since it took me a lot of time to build it. I researched a lot on Stack Overflow and after several failed attempts I had no option but to ask for help.
You should use a JOIN:
SELECT m.idMaterial, m.nextRevisionDate, mr.revisionDate AS "lastRevisionDate"
FROM Material m
LEFT JOIN MaterialRevision AS mr ON mr.idMaterial = m.idMaterial AND mr.revisionDate = (
SELECT MAX(ch.revisionDate) from MaterialRevision ch
WHERE mr.idMaterial = ch.idMaterial)
WHERE m.obsolete = 0
Here is an explanation of what INNER JOIN, LEFT JOIN and RIGHT JOIN are. (You will love them if you often cross tables in your queries)
As m.obsolete will always be 0 in the results, I omitted it from the SELECT clause.
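The LEFT JOIN behaviour can be tried out with a small sketch, using Python's sqlite3 module in place of MySQL. The column names come from the question; the dates and rows are invented, and COALESCE is an addition here to produce the blank string the question asks for when a material has no revision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Material (idMaterial INTEGER PRIMARY KEY,
                       nextRevisionDate TEXT, obsolete INTEGER);
CREATE TABLE MaterialRevision (idMaterial INTEGER REFERENCES Material(idMaterial),
                               revisionDate TEXT);
INSERT INTO Material VALUES (1, '2022-01-01', 0), (2, '2022-02-01', 0), (3, '2022-03-01', 1);
-- Material 2 has no revisions at all; material 3 is obsolete.
INSERT INTO MaterialRevision VALUES (1, '2020-01-01'), (1, '2021-06-01'), (3, '2021-01-01');
""")

rows = conn.execute("""
    SELECT m.idMaterial, m.nextRevisionDate,
           COALESCE(mr.revisionDate, '') AS lastRevisionDate
    FROM Material m
    LEFT JOIN MaterialRevision AS mr
        ON mr.idMaterial = m.idMaterial
        AND mr.revisionDate = (SELECT MAX(ch.revisionDate)
                               FROM MaterialRevision ch
                               WHERE ch.idMaterial = mr.idMaterial)
    WHERE m.obsolete = 0
    ORDER BY m.idMaterial
""").fetchall()
print(rows)
```

Material 2 still appears, with an empty last revision date, because the LEFT JOIN keeps rows from Material even when nothing matches in MaterialRevision.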
You should use the left outer join instead of using the cross product.
Your query should be something like this:
SELECT m.idMaterial, m.nextRevisionDate, m.obsolete,
       mr.revisionDate AS lastRevisionDate
FROM Material AS m
LEFT OUTER JOIN MaterialRevision AS mr
    ON m.idMaterial = mr.idMaterial
    AND mr.revisionDate = (SELECT MAX(ch.revisionDate) FROM MaterialRevision ch
                           WHERE mr.idMaterial = ch.idMaterial)
WHERE m.obsolete = 0;
Here you can find some documentation about types of join.

How can I filter out results based on another table. (A reverse join I guess?)

Basically, I have a table which contains two fields: [id, other] which have user tokens stored in them. The goal of my query is to select a random user that has not been selected before. Once the user is selected it is stored in the table shown above. So if Jack selects Jim randomly, Jack cannot select Jim again, and on the flip side, Jim cannot select Jack.
Something like this is what comes to mind:
SELECT * FROM users
WHERE (SELECT * FROM selected WHERE (id=? AND other=?) OR (id=? AND other=?));
Well, first of all I've read that using sub-queries like this is extremely inefficient, and I'm not even sure if I used the correct syntax. The problem, however, is that I have numerous tables in my scenario which I need to filter by, so it would look more like this:
SELECT * FROM users u
WHERE (SELECT * FROM selected WHERE (id=? AND other=?) OR (id=? AND other=?))
AND (SELECT * FROM other_table WHERE (id=? AND other=?) OR (id=? AND other=?))
AND (SELECT * FROM diff_table WHERE (id=? AND value=?))
AND u.type = 'BASIC'
LIMIT = 1
I feel like there's a much, much more efficient way of handling this.
Please note: I don't want a row returned at all if the user's id is present in any of the nested queries. Returning "null" is not sufficient. The reason I have the OR clause is that the user's id can be stored in either the id or the other field, so we need to check both.
I am using PostgreSQL 9.5.3, but I added the MySQL tag as the code is mostly backwards compatible. Fancy PostgreSQL-only solutions are accepted (if any).
You can left join to another table, which produces nulls where no record is found:
SELECT u.* FROM users u
LEFT JOIN selected s ON s.id = u.id OR s.other = u.other
WHERE s.id IS NULL
The OR in a join is unusual, but should work. The example is kind of silly, but it works as long as you understand the logic: left join the first table to the second table. A non-null column from the second table means at least one record matched the join conditions; a null column means no record was found.
And you are right: avoid the WHERE field = (SELECT ...) logic when you can, as it tends to perform poorly.
Use an outer join filtered on missed joins:
SELECT * FROM users u
LEFT JOIN selected s on u.id in (s.id, s.other) and ? in (s.id, s.other)
WHERE u.id != ?
AND s.id IS NULL
LIMIT 1
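The anti-join above can be exercised with a small sketch, using Python's sqlite3 module as a stand-in (the question targets PostgreSQL, where the same SQL should behave alike). The users and selected tables follow the question; the concrete ids and the variable name me are assumptions for the demo. The query lists every candidate rather than one random pick so that the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE selected (id INTEGER, other INTEGER);
INSERT INTO users VALUES (1, 'BASIC'), (2, 'BASIC'), (3, 'BASIC'), (4, 'BASIC');
-- User 1 already selected user 2, and user 3 already selected user 1.
INSERT INTO selected VALUES (1, 2), (3, 1);
""")

me = 1  # hypothetical id of the user doing the selecting

# Keep only users for whom no 'selected' row pairs them with `me`,
# in either direction. Append ORDER BY random() LIMIT 1 for a random pick.
candidates = conn.execute("""
    SELECT u.id
    FROM users u
    LEFT JOIN selected s
        ON u.id IN (s.id, s.other) AND ? IN (s.id, s.other)
    WHERE u.id != ?
      AND s.id IS NULL
    ORDER BY u.id
""", (me, me)).fetchall()
print(candidates)
```

Users 2 and 3 are filtered out because a row in selected links each of them to user 1; only user 4 remains.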

Need to find all client id's the most efficient/fastest way using mySql

I have a bridging table that looks like this
clients_user_groups
id = int
client_id = int
group_id = int
I need to find all client_ids of clients that belong to the same group as client_id 46.
I can achieve it doing a query as below which produces the correct results
SELECT client_id FROM clients_user_groups WHERE group_id = (SELECT group_id FROM clients_user_groups WHERE client_id = 46);
Basically, what I need to find out is whether there's a way of achieving the same results without using 2 queries, or a faster way, or whether the method above is the best solution.
You're using a WHERE-clause subquery which, in older MySQL versions, can end up being reevaluated for every single row in your table. Use a JOIN instead:
SELECT a.client_id
FROM clients_user_groups a
JOIN clients_user_groups b ON b.client_id = 46
AND a.group_id = b.group_id
Since you plan on facilitating clients having more than one group in the future, you might want to add DISTINCT to the SELECT so that multiple of the same client_ids aren't returned when you do switch (as a result of the client being in more than one of client_id 46's groups).
If you haven't done so already, create the following composite index on:
(client_id, group_id)
With client_id at the first position in the index since it most likely offers the best initial selectivity. Also, if you've got a substantial amount of rows in your table, ensure that the index is being utilized with EXPLAIN.
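Here is a minimal sketch of the self join together with the suggested composite index, run against SQLite through Python's sqlite3 module rather than MySQL. The table layout is taken from the question; the index name and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients_user_groups (
    id INTEGER PRIMARY KEY,
    client_id INTEGER,
    group_id INTEGER
);
-- Hypothetical index name; the column order follows the answer.
CREATE INDEX idx_client_group ON clients_user_groups (client_id, group_id);
INSERT INTO clients_user_groups VALUES
    (1, 46, 7), (2, 47, 7), (3, 48, 8), (4, 49, 7);
""")

# b is pinned to client 46's row(s); a then ranges over every row
# that shares a group with b. DISTINCT guards against duplicates
# if a client ever belongs to several of client 46's groups.
same_group = [row[0] for row in conn.execute("""
    SELECT DISTINCT a.client_id
    FROM clients_user_groups a
    JOIN clients_user_groups b
        ON b.client_id = 46
       AND a.group_id = b.group_id
    ORDER BY a.client_id
""")]
print(same_group)
```

The result includes client 46 itself, just like the original subquery version does.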
You can also try a self join:
SELECT a.client_id
FROM clients_user_groups a
LEFT JOIN clients_user_groups b on b.client_id=46
Where b.group_id=a.group_id
SET @groupID = (SELECT group_id FROM clients_user_groups WHERE client_id = 46);
SELECT client_id FROM clients_user_groups WHERE group_id = @groupID;
You will have a query which gets the group ID and you store it into a variable. After this you select the client_id values where the group_id matches the value stored in your variable. You can speed up this query even more if you define an index for clients_user_groups.group_id.
Note1: I didn't test my code, hopefully there are no typos, but you've got the idea I think.
Note2: This should be done in a single request, because DB requests are very expensive if we look at the needed time.
Based on your comment that each client can only belong to one group, I would suggest a schema change to place the group_id relation into the client table as a field. Typically, one would use the sort of JOIN table you have described to express many-to-many relationships within a relational database (i.e. clients could belong to many groups and groups could have many clients).
In such a scenario, the query would be made without the need for a sub-select like this:
SELECT c.client_id
FROM clients as c
INNER JOIN clients as c2 ON c.group_id = c2.group_id
WHERE c2.client_id = ?

Query efficiency (multiple selects)

I have two tables - one called customer_records and another called customer_actions.
customer_records has the following schema:
CustomerID (auto increment, primary key)
CustomerName
...etc...
customer_actions has the following schema:
ActionID (auto increment, primary key)
CustomerID (relates to customer_records)
ActionType
ActionTime (UNIX time stamp that the entry was made)
Note (TEXT type)
Every time a user carries out an action on a customer record, an entry is made in customer_actions, and the user is given the opportunity to enter a note. ActionType can be one of a few values (like 'designatory update' or 'added case info' - can only be one of a list of options).
What I want to be able to do is display a list of records from customer_records where the last ActionType was a certain value.
So far, I've searched the net/SO and come up with this monster:
SELECT * FROM (
SELECT * FROM (
SELECT * FROM `customer_actions` ORDER BY `EntryID` DESC
) list1 GROUP BY `CustomerID`
) list2 WHERE `ActionType`='whatever' LIMIT 0,30
Which is great - it lists each customer ID and their last action. But the query is extremely slow on occasions (note: there are nearly 20,000 records in customer_records). Can anyone offer any tips on how I can sort this monster of a query out or adjust my table to give faster results? I'm using MySQL. Any help is really appreciated, thanks.
Edit: To be clear, I need to see a list of customers whose last action was 'whatever'.
To filter customers by their last action, you could use a correlated sub-query...
SELECT
    *
FROM
    customer_records
INNER JOIN
    customer_actions
        ON customer_actions.CustomerID = customer_records.CustomerID
        AND customer_actions.ActionTime = (
            SELECT
                MAX(lookup.ActionTime)
            FROM
                customer_actions AS lookup
            WHERE
                lookup.CustomerID = customer_records.CustomerID
        )
WHERE
    customer_actions.ActionType = 'Whatever'
You may find it more efficient to avoid the correlated sub-query as follows...
SELECT
    *
FROM
    customer_records
INNER JOIN
    (SELECT CustomerID, MAX(ActionTime) AS ActionTime FROM customer_actions GROUP BY CustomerID) AS last_action
        ON customer_records.CustomerID = last_action.CustomerID
INNER JOIN
    customer_actions
        ON customer_actions.CustomerID = last_action.CustomerID
        AND customer_actions.ActionTime = last_action.ActionTime
WHERE
    customer_actions.ActionType = 'Whatever'
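The derived-table approach can be checked on sample data with a short sketch, using Python's sqlite3 module instead of MySQL. Column names follow the schema in the question (ActionTime as a UNIX timestamp); the customers, timestamps, and notes are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_records (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE customer_actions (
    ActionID INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES customer_records(CustomerID),
    ActionType TEXT,
    ActionTime INTEGER,
    Note TEXT
);
INSERT INTO customer_records VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO customer_actions VALUES
    (1, 1, 'added case info',    100, ''),
    (2, 1, 'designatory update', 200, ''),
    (3, 2, 'designatory update', 150, ''),
    (4, 2, 'added case info',    300, ''),
    (5, 3, 'added case info',     50, '');
""")

# last_action holds one row per customer with the time of the latest
# action; joining back to customer_actions recovers that action's type.
last_was = conn.execute("""
    SELECT cr.CustomerID, cr.CustomerName
    FROM customer_records AS cr
    INNER JOIN (SELECT CustomerID, MAX(ActionTime) AS ActionTime
                FROM customer_actions
                GROUP BY CustomerID) AS last_action
        ON cr.CustomerID = last_action.CustomerID
    INNER JOIN customer_actions AS ca
        ON ca.CustomerID = last_action.CustomerID
       AND ca.ActionTime = last_action.ActionTime
    WHERE ca.ActionType = ?
    ORDER BY cr.CustomerID
""", ("added case info",)).fetchall()
print(last_was)
```

Ann is excluded even though she has an 'added case info' action, because her most recent action is a different type. If two actions for one customer share the same ActionTime, both rows match, so a tie-breaker such as ActionID may be needed.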
I'm not sure if I understand the requirements but it looks to me like a JOIN would be enough for that.
SELECT cr.CustomerID, cr.CustomerName, ...
FROM customer_records cr
INNER JOIN customer_actions ca ON ca.CustomerID = cr.CustomerID
WHERE `ActionType` = 'whatever'
ORDER BY
ca.EntryID
Note that 20,000 records should not pose a performance problem.
Please note that I've adapted Lieven's answer (I made a separate post as this was too long for a comment). Any credit for the solution itself goes to him, I'm just trying to show you some key points for improving performance.
If speed is a concern then the following should give you some suggestions for improving it:
select
    cr.CustomerID,
    cr.CustomerName,
    cr.MoreDetail1,
    cr.Etc
from customer_records cr
inner join customer_actions ca
    on ca.CustomerID = cr.CustomerID
where ca.ActionType = 'x'
order by cr.CustomerID
limit 100 -- MySQL uses LIMIT rather than TOP; change as required
A few notes:
In some cases I find left outer joins to be faster than inner joins. It would be worth measuring performance for both for this query.
Avoid returning * wherever possible.
You don't have to reference 'cr.x' in the initial select, but it's a good habit to get into for when you start working on large queries that can have multiple joins in them (this will make a lot of sense once you start doing this).
When using joins, join on indexed columns, ideally a primary key.
Maybe I'm missing something but what's wrong with a simple join and a where clause?
Select ActionType, ActionTime, Note
FROM Customer_Records CR
INNER JOIN customer_Actions CA
ON CR.CustomerID = CA.CustomerID
Where ActionType = 'added case info'

Explain SQL and Query optimization

EXPLAIN (in phpMyAdmin) for a query that takes more than 5 seconds gives me the above. I read that we can study the EXPLAIN output to optimize a query. Can anyone tell whether this EXPLAIN output reveals anything useful?
Thanks guys.
Edit:
The query itself:
SELECT
a.`depart` , a.user,
m.civ, m.prenom, m.nom,
CAST( GROUP_CONCAT( DISTINCT concat( c.id, '~', c.prenom, ' ', c.nom ) ) AS char ) AS coordinateur,
z.dr
FROM `0_activite` AS a
JOIN `0_member` AS m ON a.user = m.id
LEFT JOIN `0_depart` AS d ON ( m.depart = d.depart AND d.rank = 'mod' AND d.user_sec =2 )
LEFT JOIN `0_member` AS c ON d.user_id = c.id
LEFT JOIN `zone_base` AS z ON m.depart = z.deprt_num
GROUP BY a.user
Edit 2:
Structures of the two tables a and d. Top: a and bottom: d
Edit 3:
What I want in this query?
I first want to get the values of 'depart' and 'user' (which is an id) from the table 0_activite. Next, I want to get the name of the person (civ, prenom and nom) from 0_member whose id I am getting from 0_activite via 'user', by matching 0_activite.user with 0_member.id. Here depart is short for department, which is also an id.
So at this point, I have the depart, id, civ, nom and prenom of a person from two tables, 0_activite and 0_member.
Next, I want to know which dr is related to this depart, and this I get from zone_base. The value of depart is the same in both 0_activite and 0_member.
Then comes the trickier part. A person from 0_member can be associated with multiple departs, and this is stored in 0_depart. Also, every user has a level, one of which is 'mod', which stands for moderator. Now I want to get all the people who are moderators in the depart that the first user belongs to, and then get those moderators' names from 0_member again. I also have a variable user_sec, which is probably less important in this context, though I cannot overlook it.
This is what makes the query a tricky one. 0_member stores the id and name of each user plus one depart, 0_depart stores all departs of each user, one line per depart, and 0_activite stores some other data, and I want to relate all of those through the user id in 0_activite.
Hope I have been clear. If I am not, please let me know and I will try again to edit this post.
Many many thanks again.
Aside from the few answers provided by the others here, it might help to better understand the "what do I want" from the query. As you've accepted a rather recent answer from me in another of your questions, you have filters applied by department information.
Your query is doing a LEFT join at the Department table by rank = 'mod' and user_sec = 2. Is your overall intent to show ALL records in the 0_activite table REGARDLESS of a valid join to the 0_Depart table... and if there IS a match to the 0_Depart table, you only care about the 'mod' and 2 values?
If you only care about those people specifically associated with the 0_depart with 'mod' and 2 conditions, I would reverse the query starting with THIS table first, then join to the rest.
Having keys on tables via relationship or criteria is always a performance benefit (vs not having the indexes).
Start your query with whatever would be your smallest set FIRST, then join to other tables.
From clarification in your question... I would start with the inner-most... Who it is and what departments are they associated with... THEN get the moderators (from department where condition)... Then get actual moderator's name info... and finally out to your zone_base for the dr based on the department of the MODERATOR...
select STRAIGHT_JOIN
DeptPerMember.*,
Moderator.Civ as ModCiv,
Moderator.Prenom as ModPrenom,
Moderator.Nom as ModNom,
z.dr
from
( select
m.ID,
m.Depart,
m.Civ,
m.Prenom,
m.Nom
from
0_Activite as a
join 0_member m
on a.User = m.ID
join 0_Depart as d
on m.depart = d.depart ) DeptPerMember
join 0_Depart as DeptForMod
on DeptPerMember.Depart = DeptForMod.Depart
and DeptForMod.rank = 'mod'
and DeptForMod.user_sec = 2
join 0_Member as Moderator
on DeptForMod.user_id = Moderator.ID
join zone_base z
on Moderator.depart = z.deprt_num
Notice how I tier'd the query to get each part and joined to the next and next and next. I'm building the chain based on the results of the previous with clear "alias" references for clarification of content. Now, you can get whatever respective elements from any of the levels via their distinct "alias" references...
The output from EXPLAIN shows that the database engine is not using any index for the first and third tables listed (a and d) when executing this query. The key column is NULL for both, which is a shame since both are 'large' tables (OK, they're not really large, but compared to the rest of the tables they're the big ones).
Judging from the query, an index on user on 0_activite and an index on (depart, rank, user_sec) on 0_depart would go some way to improving performance.
You can see that the key and key_len columns are NULL, which means no key from the possible_keys column is being used. So tables a and d are both scanning all rows. (Check the larger numbers in the rows column; you want these to be smaller.)
To deal with 0_depart:
Make sure you have a key on (d.depart, d.rank, d.user_sec), which are the columns used in the join on 0_depart.
To deal with 0_activite:
I'm not positive, but a GROUP BY column should be indexed too, so you need a key on a.user.
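To see the effect of the suggested composite index on a query plan, here is a rough sketch using Python's sqlite3 module. This is SQLite's EXPLAIN QUERY PLAN, not MySQL's EXPLAIN, so the output format differs from the phpMyAdmin screenshot, but the idea of moving from a full scan to an index lookup is the same. The column list for 0_depart, the index name, and the depart value 75 are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the 0_depart table.
conn.execute("""
    CREATE TABLE "0_depart" (
        depart INTEGER, user_id INTEGER, "rank" TEXT, user_sec INTEGER
    )
""")

query = """
    SELECT user_id FROM "0_depart"
    WHERE depart = 75 AND "rank" = 'mod' AND user_sec = 2
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

before = plan(query)  # expected to describe a full table scan
conn.execute("""
    CREATE INDEX idx_depart_rank_sec ON "0_depart" (depart, "rank", user_sec)
""")
after = plan(query)   # expected to mention the new index
print(before)
print(after)
```

In MySQL the equivalent check is to rerun EXPLAIN after adding the index and confirm that the key column is no longer NULL for that table.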