SQL query to view EAV data as 3NF? (Drupal 6 profile values) - mysql

I have data stored in a MySQL database according to the Entity-Attribute-Value pattern (EAV), specifically user profile values from Drupal 6. I would need an SQL query or view to get the data as a normal relational table. The tables have the following layout:
Table: users
user_id username
---------------------
1 steve
2 michelle
Table: profile_fields
field_id field_name
------------------------
1 first_name
2 last_name
Table: profile_values
field_id user_id value
---------------------------
1 1 Steve
2 1 Smith
1 2 Michelle
2 2 Addams
And I would need to somehow get the following result from a query:
user_id first_name last_name
-----------------------------------
1 Steve Smith
2 Michelle Addams
I have understood this is impossible to do in a single SQL query in the general case. But this is not the general case, and I have two advantages:
I know the content of the "profile_fields" table, and I am 100% sure that this data will not change for the time period that this query will be used.
It doesn't have to be in a single query - it can be a query, some PHP code to analyze the results and then another query.

This can be done in a single SQL query using correlated scalar subqueries, one per profile field:
SELECT
u.user_id,
(select value from profile_values f1 WHERE f1.field_id=1 and u.user_id=f1.user_id) AS first_name,
(select value from profile_values f2 WHERE f2.field_id=2 and u.user_id=f2.user_id) AS last_name
FROM users u
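Since the question allows "a query, some code, then another query", here is a minimal sketch of that two-step idea. It is in Python with an in-memory SQLite database as a stand-in for PHP and MySQL (both are assumptions, not what the asker uses). It reads profile_fields first, then builds one pivot query with a MAX(CASE ...) column per field. The helper names build_pivot_sql and pivot_profiles are hypothetical, and the field names are assumed to be trusted, since they are pasted into the SQL text.

```python
import sqlite3

def build_pivot_sql(field_rows):
    """Build one SELECT with a MAX(CASE ...) column per profile field."""
    columns = []
    for field_id, field_name in field_rows:
        # field_name is assumed to be trusted; escape double quotes anyway.
        # (MySQL would normally quote identifiers with backticks instead.)
        safe_name = field_name.replace('"', '""')
        columns.append(
            f"MAX(CASE WHEN v.field_id = {int(field_id)} THEN v.value END) "
            f'AS "{safe_name}"'
        )
    return (
        "SELECT u.user_id, " + ", ".join(columns) + " "
        "FROM users u LEFT JOIN profile_values v ON v.user_id = u.user_id "
        "GROUP BY u.user_id ORDER BY u.user_id"
    )

def pivot_profiles(conn):
    """Step 1: read the field list. Step 2: run the generated pivot query."""
    fields = conn.execute(
        "SELECT field_id, field_name FROM profile_fields ORDER BY field_id"
    ).fetchall()
    return conn.execute(build_pivot_sql(fields)).fetchall()

# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE profile_fields (field_id INTEGER PRIMARY KEY, field_name TEXT);
    CREATE TABLE profile_values (field_id INTEGER, user_id INTEGER, value TEXT);
    INSERT INTO users VALUES (1, 'steve'), (2, 'michelle');
    INSERT INTO profile_fields VALUES (1, 'first_name'), (2, 'last_name');
    INSERT INTO profile_values VALUES
        (1, 1, 'Steve'), (2, 1, 'Smith'), (1, 2, 'Michelle'), (2, 2, 'Addams');
""")
rows = pivot_profiles(conn)
print(rows)
```

The generated query needs only one pass over profile_values instead of one subquery per column, which may matter if the number of fields grows.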

Related

MySQL select delimited data

I have inherited a table with information about some groups of people, in which one field contains delimited data whose values match rows in another table.
id_group Name
-----------------------
1 2|4|5
2 3|4|6
3 1|2
And in another table I have a list of people who may belong to one or more groups
id_names Names
-----------------------
1 Jack
2 Joe
3 Fred
4 Mary
5 Bill
I would like to perform a select on the group data which results in a single field containing a comma- or space-delimited list of names, such as "Joe Mary Bill" for the first group row above.
I have looked at using a function to split the delimited string, and also looked at sub queries, but concatenating the results of sub queries quickly becomes huge.
Thanks!
As implied by Strawberry's comment above, there is a way to do this, but it's so ugly. It's like finishing your expensive kitchen remodel using duct tape. You should feel resentment toward the person who designed the database this way.
SELECT g.id_group, GROUP_CONCAT(n.Names SEPARATOR ' ') AS Names
FROM groups AS g JOIN names AS n
ON FIND_IN_SET(n.id_names, REPLACE(g.Name, '|', ','))
GROUP BY g.id_group;
Output, tested on MySQL 5.6:
+----------+---------------+
| id_group | Names |
+----------+---------------+
| 1 | Joe Mary Bill |
| 2 | Fred Mary |
| 3 | Jack Joe |
+----------+---------------+
The complexity of this query, and the fact that it is forced to do a table scan and cannot use an index, should convince you of what is wrong with storing a list of IDs in a delimited string.
The better solution is to create a third table, in which you store each individual member of the group on a row by itself. That is, multiple rows per group.
CREATE TABLE group_name (
id_group INT NOT NULL,
id_name INT NOT NULL,
PRIMARY KEY (id_group, id_name)
);
Then you can query in a simpler way, and you have an opportunity to create indexes to make the query very fast.
SELECT id_group, GROUP_CONCAT(n.Names SEPARATOR ' ') AS names
FROM groups
JOIN group_name USING (id_group)
JOIN names AS n ON n.id_names = group_name.id_name
GROUP BY id_group;
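To get from the delimited strings to the group_name table, something has to split the existing values once. Here is a rough sketch of that one-off migration, in Python with in-memory SQLite standing in for MySQL (an assumption). The table name groups_raw and the helper split_ids are hypothetical; the sample rows are the ones from the question. Note that ID 6 in group 2 has no matching person, so the join simply drops it.

```python
import sqlite3

def split_ids(delimited):
    """Turn '2|4|5' into [2, 4, 5], skipping empty pieces."""
    return [int(piece) for piece in delimited.split("|") if piece.strip()]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE groups_raw (id_group INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE names (id_names INTEGER PRIMARY KEY, Names TEXT);
    CREATE TABLE group_name (
        id_group INTEGER NOT NULL,
        id_name INTEGER NOT NULL,
        PRIMARY KEY (id_group, id_name)
    );
    INSERT INTO groups_raw VALUES (1, '2|4|5'), (2, '3|4|6'), (3, '1|2');
    INSERT INTO names VALUES
        (1, 'Jack'), (2, 'Joe'), (3, 'Fred'), (4, 'Mary'), (5, 'Bill');
""")

# One row per (group, member) pair instead of one delimited string per group.
raw_rows = conn.execute("SELECT id_group, Name FROM groups_raw").fetchall()
pairs = [
    (id_group, id_name)
    for id_group, delimited in raw_rows
    for id_name in split_ids(delimited)
]
conn.executemany("INSERT INTO group_name VALUES (?, ?)", pairs)

# With the bridge table in place, membership is a plain indexed join.
query = """
    SELECT gn.id_group, n.Names
    FROM group_name AS gn
    JOIN names AS n ON n.id_names = gn.id_name
    ORDER BY gn.id_group, n.id_names
"""
members = {}
for id_group, name in conn.execute(query):
    members.setdefault(id_group, []).append(name)
print(members)
```

The names are collected in Python here rather than with GROUP_CONCAT, only because SQLite's separator syntax differs from MySQL's.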
Shadow is correct. Your primary problem is the poor design of the relations in the database. This kind of business problem is typically modeled as a so-called M:N (many-to-many) relation. To accomplish that you need 3 tables:
The first table is Groups, which has a GroupId field as primary key and a readable Name field (e.g. 'group1' or whatever).
The second table is people, which looks exactly like the one you showed above (do not forget to put a primary key on the PeopleId field here as well).
The third table is a bridge table called GroupMemberships. It has 2 fields, GroupId and PeopleId. This table connects the first two and represents the M:N relation: one group can have 1 to N members, and a person can be a member of 1 to M groups.
Finally, just join the tables together in the select and aggregate:
SELECT
g.Name,
GROUP_CONCAT(p.Name ORDER BY p.PeopleId DESC SEPARATOR ';') AS Members
FROM
Groups AS g
INNER JOIN GroupMemberships AS gm ON g.GroupId = gm.GroupId
INNER JOIN people AS p ON gm.PeopleId = p.PeopleId
GROUP BY g.Name;

Aggregate Text data using SQL

I have the following data:
Name | Condition
Mike | Good
Mike | Good
Steve | Good
Steve | Alright
Joe | Good
Joe | Bad
I want to write an if statement: if Bad exists for a name, classify the name as Bad. If Bad does not exist but Alright exists, classify it as Alright. If only Good exists, classify it as Good.
So my data would turn into:
Name | Condition
Mike | Good
Steve | Alright
Joe | Bad
Is this possible in SQL?
An Access query would be easy if you first create a table which maps Condition to a rank number.
Condition rank
--------- ----
Bad 1
Alright 2
Good 3
Then a GROUP BY query would give you the minimum rank for each Name:
SELECT y.Name, Min(c1.rank) AS MinOfrank
FROM
[YourTable] AS y
INNER JOIN conditions AS c1
ON y.Condition = c1.Condition
GROUP BY y.Name;
If you want to display the Condition string for those ranks, join back to the conditions table again:
SELECT sub.Name, sub.MinOfrank, c2.Condition
FROM
(
SELECT y.Name, Min(c1.rank) AS MinOfrank
FROM
[YourTable] AS y
INNER JOIN conditions AS c1
ON y.Condition = c1.Condition
GROUP BY y.Name
) AS sub
INNER JOIN conditions AS c2
ON sub.MinOfrank = c2.rank;
Performance should be fine with indexes on those conditions fields.
Seems to me this approach could also work in those other databases (MySQL and SQL Server) tagged in the question.
You can use a CASE expression to rank the conditions, then MIN() or MAX() to summarize the results before returning them to the user in the same format.
Query:
SELECT [Name]
, CASE MIN(CASE [Condition] WHEN 'Bad' THEN 0 WHEN 'Alright' THEN 1 ELSE 2 END)
WHEN 0 THEN 'Bad' WHEN 1 THEN 'Alright' WHEN 2 THEN 'Good' END AS [Condition]
FROM mytable
GROUP BY [Name];
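For anyone who wants to try the rank-then-MIN idea without a server, here is a small sketch using Python's sqlite3 module with an in-memory database (SQLite is an assumption; the question is tagged for other databases, and the bracket quoting is dropped). The table name mytable follows the query above, and the output column Worst is a made-up name.

```python
import sqlite3

# Map each condition to a number, take the smallest per name, map it back.
WORST_CONDITION_SQL = """
    SELECT Name,
           CASE MIN(CASE Condition
                        WHEN 'Bad' THEN 0
                        WHEN 'Alright' THEN 1
                        ELSE 2
                    END)
               WHEN 0 THEN 'Bad'
               WHEN 1 THEN 'Alright'
               ELSE 'Good'
           END AS Worst
    FROM mytable
    GROUP BY Name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (Name TEXT, Condition TEXT);
    INSERT INTO mytable VALUES
        ('Mike', 'Good'), ('Mike', 'Good'),
        ('Steve', 'Good'), ('Steve', 'Alright'),
        ('Joe', 'Good'), ('Joe', 'Bad');
""")

# Collect into a dict because GROUP BY alone does not guarantee row order.
worst = dict(conn.execute(WORST_CONDITION_SQL).fetchall())
print(worst)
```

Any value other than 'Bad' or 'Alright' is treated as 'Good' here, which is a simplification that works for the sample data only.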
MySQL has an IF() function.
Here, have a look at it: https://dev.mysql.com/doc/refman/5.1/en/control-flow-functions.html#function_if

MySQL counting number of max groups

I asked a similar question earlier today, but I've run into another issue that I need assistance with.
I have a logging system that scans a server and catalogs every user that's online at that given moment. Here is what my table looks like:
-----------------
| ab_logs |
-----------------
| id |
| scan_id |
| found_user |
-----------------
id is an autoincrementing primary key. Has no real value other than that.
scan_id is an integer that is incremented after each successful scan of all users. It exists so I can separate results from different scans.
found_user stores which user was found online during the scan.
The above will generate a table that could look like this:
id | scan_id | found_user
----------------------------
1 | 1 | Nick
2 | 2 | Nick
3 | 2 | John
4 | 3 | John
So on the first scan the system found only Nick online. On the 2nd it found both Nick and John. On the 3rd only John was still online.
My problem is that I want to get the total amount of unique users connected to the server at the time of each scan. In other words, I want the aggregate number of users that have connected at each scan. Think counter.
From the example above, the result I want from the sql is:
1
2
2
EDIT:
This is what I have tried so far, but it's wrong:
SELECT COUNT(DISTINCT(found_user)) FROM ab_logs WHERE DATE(timestamp) = CURDATE() GROUP BY scan_id
What I tried returns this:
1
2
1
The code below should give you the results you are looking for (tblScans stands in for your ab_logs table):
select s.scan_id, count(*) from
(select distinct
t.scan_id
,t1.found_user
from
tblScans t
inner join tblScans t1 on t.scan_id >= t1.scan_id) s
group by
s.scan_id;
It assumes the user names are unique, and the count for each scan includes the current scan and every previous one.
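The same running count can also be written with a correlated subquery instead of a self-join. Below is a minimal sketch of that variant, run against the sample rows from the question in an in-memory SQLite database through Python (SQLite is an assumption here; the column alias users_so_far is made up).

```python
import sqlite3

# For each scan, count the distinct users seen in that scan or any earlier one.
RUNNING_DISTINCT_SQL = """
    SELECT s.scan_id,
           (SELECT COUNT(DISTINCT l.found_user)
              FROM ab_logs AS l
             WHERE l.scan_id <= s.scan_id) AS users_so_far
    FROM (SELECT DISTINCT scan_id FROM ab_logs) AS s
    ORDER BY s.scan_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ab_logs (
        id INTEGER PRIMARY KEY,
        scan_id INTEGER,
        found_user TEXT
    );
    INSERT INTO ab_logs VALUES
        (1, 1, 'Nick'), (2, 2, 'Nick'), (3, 2, 'John'), (4, 3, 'John');
""")

running = conn.execute(RUNNING_DISTINCT_SQL).fetchall()
for scan_id, users_so_far in running:
    print(scan_id, users_so_far)
```

Both versions rescan earlier rows for every scan, so on a large log table an index on scan_id would probably be worth having.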
Try a GROUP BY clause (note that this counts the rows in each scan separately rather than accumulating users across scans):
SELECT scan_id, count(*)
FROM mytable
GROUP BY scan_id

MySQL query get column value similar to given

Sorry if my question seems unclear, I'll try to explain.
I have a column in a row, for example /1/3/5/8/42/239/, let's say I would like to find a similar one where there is as many corresponding "ids" as possible.
Example:
| My Column |
#1 | /1/3/7/2/4/ |
#2 | /1/5/7/2/4/ |
#3 | /1/3/6/8/4/ |
Now, by running the query on #1 I would like to get row #2 as it's the most similar. Is there any way to do it or it's just my fantasy? Thanks for your time.
EDIT:
As suggested, I'm expanding my question. This column represents the favourite artists of a user on a music site. I search them like this: MyColumn LIKE '%/ID/%', and I remove an artist by replacing /ID/ with /.
Since you did not provide much info about your data, I have to fill the gaps with my guesses.
So you have a users table
users table
-----------
id
name
other_stuff
And you would like to store which artists are a user's favorites. So you must have an artists table
artists table
-------------
id
name
other_stuff
And to relate you can add another table called favorites
favorites table
---------------
user_id
artist_id
In that table you add a record for every artist that a user likes.
Example data
users
id | name
1 | tom
2 | john
artists
id | name
1 | michael jackson
2 | madonna
3 | deep purple
favorites
user_id | artist_id
1 | 1
1 | 3
2 | 2
To select the favorites of user tom for instance you can do
select a.name
from artists a
join favorites f on f.artist_id = a.id
join users u on f.user_id = u.id
where u.name = 'tom'
And if you add proper indexing to your table then this is really fast!
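With a favorites table like this, the original "most similar row" question turns into a self-join that counts shared artists. Here is a sketch of that, using Python with in-memory SQLite as a stand-in for MySQL (an assumption). The function name most_similar_users is hypothetical, and the sample rows are the three columns from the question (#1, #2, #3) rewritten as one row per favorite.

```python
import sqlite3

def most_similar_users(conn, user_id):
    """Return (other_user_id, shared_artist_count) pairs, best match first."""
    return conn.execute(
        """
        SELECT other.user_id, COUNT(*) AS shared
        FROM favorites AS mine
        JOIN favorites AS other
          ON other.artist_id = mine.artist_id
         AND other.user_id <> mine.user_id
        WHERE mine.user_id = ?
        GROUP BY other.user_id
        ORDER BY shared DESC, other.user_id
        """,
        (user_id,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE favorites ("
    " user_id INTEGER NOT NULL,"
    " artist_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id, artist_id))"
)
# /1/3/7/2/4/, /1/5/7/2/4/ and /1/3/6/8/4/ from the question, one row each.
sample = {1: [1, 3, 7, 2, 4], 2: [1, 5, 7, 2, 4], 3: [1, 3, 6, 8, 4]}
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)",
    [(user, artist) for user, artists in sample.items() for artist in artists],
)

ranking = most_similar_users(conn, 1)
print(ranking)
```

For user 1 the best match is user 2, with four shared artists, which is the result the question asks for. Users with no artist in common do not appear in the ranking at all.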
Problem is you're storing this in a really, really awkward way.
I'm guessing you have to deal with an arbitrary number of values. You have two options:
Store the multiple IDs in a blob in JSON format. While MySQL versions before 5.7.8 don't have JSON functions built in, there are user-defined functions that will extract values for you, etc.
See: http://blog.ulf-wendel.de/2013/mysql-5-7-sql-functions-for-json-udf/
Alternatively, switch to PostgreSQL.
Add as many columns to your table as the maximum number of IDs you expect to have. So if /1/3/7/2/4/8/ is the longest entry, have 6 columns in your table. The reason this is bad: you'll have sparse columns that unnecessarily slow down your tables.
I'm sure you could write some horrific regex to accomplish the task, but I would caution against using complex regexes on enormous tables.

best way to generate reports on table

The question is:
I have a table that contains details; this table is used by users when they register, update their profile, or participate in different exams.
The report I need will have some calculations, like aggregate scores.
I would like to ask whether it is better to create a new table which includes the report I need, or to work on the same table.
Are you able to provide any further details? What fields are available in the table that you want to query? How do you want to display this information? On a website? For a report?
From what you describe, you need two tables. One table (let's call it 'users') would contain information about each user, and the other would contain the actual exam scores (let's call this table 'results').
Each person in the 'users' table has a unique ID number (I'll call it UID) to identify them, and each score in the 'results' table also has the UID of the person the score relates to. By including the UID of the user in the 'results' table you can link any number of results to one user (known as a one-to-many relationship).
The 'users' table could look like this:
userUID (UID for each person) | Name | User Details
1 | Barack Obama | President
2 | George Bush | Ex-President
The 'results' table could look like this:
UID for each exam | userUID (UID of the person who took the test) | Score
1 | 1 | 85
2 | 2 | 40
3 | 1 | 82
4 | 2 | 25
I always like to add a UID for things like the exam because it allows you to easily find a specific exam result.
Anyway... a query to get all of the results for Barack Obama would look like this:
SELECT Score FROM results WHERE userUID = 1;
To get results for George Bush, you just change the userUID to 2. You would obviously need to know the UID of the user (userUID) before you ran this query.
Please note that these are VERY basic examples (involving fictional characters ;) ). You could easily add an aggregated score field to the 'users' table and update it each time you add a new result to the 'results' table. Depending upon how your code is set up, this could save you a query.
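As an alternative to storing an aggregated score, the report can be computed on demand with a GROUP BY over the two tables, so no separate report table is needed. Here is a rough sketch using Python and in-memory SQLite (an assumption; the question does not say which database is used). The column names examUID and Details and the output aliases are made up to match the tables above.

```python
import sqlite3

# One report row per user: number of exams, total score, average score.
REPORT_SQL = """
    SELECT u.Name,
           COUNT(r.examUID) AS exams_taken,
           SUM(r.Score) AS total_score,
           AVG(r.Score) AS average_score
    FROM users AS u
    LEFT JOIN results AS r ON r.userUID = u.userUID
    GROUP BY u.userUID, u.Name
    ORDER BY u.userUID
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        userUID INTEGER PRIMARY KEY,
        Name TEXT,
        Details TEXT
    );
    CREATE TABLE results (
        examUID INTEGER PRIMARY KEY,
        userUID INTEGER REFERENCES users (userUID),
        Score INTEGER
    );
    INSERT INTO users VALUES
        (1, 'Barack Obama', 'President'),
        (2, 'George Bush', 'Ex-President');
    INSERT INTO results VALUES (1, 1, 85), (2, 2, 40), (3, 1, 82), (4, 2, 25);
""")

report = conn.execute(REPORT_SQL).fetchall()
for name, exams_taken, total_score, average_score in report:
    print(name, exams_taken, total_score, average_score)
```

The LEFT JOIN keeps users who have not taken any exam in the report, with a count of 0 and NULL totals. If the report becomes slow on a large results table, the query could be wrapped in a view or cached later without changing the base tables.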
Good luck - Hopefully this helps!