MySQL: How to find a value that may exist across multiple columns

I currently have a table that stores rankings of music sales in a shop:
|-date-------|-rank_1---|-rank_2---|-...
| 2015-06-30 | 112 | 145 | ...
| 2015-07-31 | 145 | 147 | ...
| ...
| ...
Each number in the rank_# column is a foreign key that references an album in a separate table:
|-album_id---|-album_name----|-...
| 112 | An Album | ...
| 145 | Another Album | ...
| ...
I want to implement a feature where I can search for an album and see its ranking across the dates. However, the album_id can show up in any of the rank_# columns, so I'd like to know whether there is a way to "invert" the tables and get a result like:
SELECT * FROM table WHERE ....
=> |-date-------|-column-----|
| 2015-06-30 | rank_2 |
| 2015-07-31 | rank_1 |
| ...
Now, the brute-force method I can think of is just to loop through the table and look at each cell in the table, but seeing as how the table is quite large, I was wondering if there was a more efficient method of doing this.

Thanks for the help everyone! It seems like I was structuring the tables incorrectly and instead should have set it up like:
|-date-------|-album_id-|-rank--|
| 2015-06-30 | 112 | 1 |
| 2015-07-31 | 145 | 2 |
| ...
| ...
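For anyone starting from the same wide layout, here is a minimal sketch of that restructuring. It uses Python's built-in sqlite3 as a stand-in for MySQL; the table name rankings_wide and the column name rank_pos are my own (RANK is a reserved word in MySQL 8.0+), and only the two sample rows from the question are loaded.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The original wide layout: one column per rank.
    CREATE TABLE rankings_wide (date TEXT, rank_1 INTEGER, rank_2 INTEGER);
    INSERT INTO rankings_wide VALUES ('2015-06-30', 112, 145),
                                     ('2015-07-31', 145, 147);

    -- Normalized layout: one row per (date, album, rank).
    CREATE TABLE rankings (date TEXT, album_id INTEGER, rank_pos INTEGER);

    -- Unpivot: one SELECT per rank_# column, combined with UNION ALL.
    INSERT INTO rankings (date, album_id, rank_pos)
    SELECT date, rank_1, 1 FROM rankings_wide
    UNION ALL
    SELECT date, rank_2, 2 FROM rankings_wide;

    -- An index on album_id lets the per-album lookup avoid a full scan.
    CREATE INDEX idx_rankings_album ON rankings (album_id, date);
""")

# Ranking history of album 145 across all dates.
album_history = conn.execute(
    "SELECT date, rank_pos FROM rankings WHERE album_id = ? ORDER BY date",
    (145,),
).fetchall()
print(album_history)
```

With more rank_# columns, the INSERT simply gets one more SELECT per column; the lookup query stays the same.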

Wolfgang, here's my approach:
Albums Table:
album_id;album_name
1;ska
2;psychobilly
3;punk
4;nu-metal
Rankings Table:
id;date;rank_1;rank_2
1;2016-08-01;1;2
2;2016-08-02;2;1
4;2016-08-03;2;4
Suggested Query:
SELECT r.date, a.album_id, a.album_name, r.rank_1, r.rank_2,
  CASE
    WHEN a.album_id = r.rank_1 THEN 'rank_1'
    WHEN a.album_id = r.rank_2 THEN 'rank_2'
  END AS `rank` -- backticks, because RANK is a reserved word in MySQL 8.0+
FROM albums a
INNER JOIN rankings r ON (r.rank_1 = a.album_id OR r.rank_2 = a.album_id)
WHERE a.album_name LIKE '%psychobilly%'
Results:
date;album_id;album_name;rank_1;rank_2;rank
2016-08-01;2;psychobilly;1;2;rank_2
2016-08-02;2;psychobilly;2;1;rank_1
2016-08-03;2;psychobilly;2;4;rank_1
Explanation:
The last column of the result names the ranking column ("rank_1" or "rank_2") in which the album's id appears on that date.
Feel free to try, this may help you...

Related

Database design for 150 million records p.a. with categories and sub categories

I need some help with a MySQL database design. The MySQL database should handle about 150 million records a year. I want to use the MyISAM engine.
The data structure:
Car brand (>500 brands)
Every car brand has 30+ car models
Every car model has the same 5 values; some models have additional values
Every value has exactly 3 fields:
timestamp
quality
actual value
The car brand can have some values with the same fields
The values are tracked every 5 minutes -> 105120 records a year
About the data:
The field quality should be always 'good' but when it's not I need to know.
The field timestamp is usually the same across values, but at least one value has a different timestamp
Deviation: 1-60 seconds
If a value has a deviating timestamp, its timestamp always deviates
Sometimes I don't get data because the source server is down.
What I want to use the data for
Visualisations in chart(time and actual value) with a selection of values
Aggregation of some values for every brand
My Questions:
I thought it's a good idea to split the data into different tables, so I put every brand in an extra table. To find the table by car brand name I created an index table. Is this a good practice?
Is it better to create tables for every car model (about 1500 tables)?
Should I store the quality (if it is not 'good') and the deviation of the timestamp in a separate table?
Any other suggestions?
Example:
Table: car_brand
| car_brand | tablename | Address |
|-----------|-----------|-------------|
| BMW | bmw_table | the address |
| ... | ... | ... |
Table: bmw_table (105120*30+ car models = more than 3.2 million records per year)
| car_model | timestamp_usage | quality_usage | usage | timestamp_fuel_consumed | quality_fuel_consumed | fuel_consumed | timestamp_kilometer | quality_kilometer | kilometer | timestamp_revenue | quality_revenue | revenue | ... |
|-------------|---------------------|---------------|-------|-------------------------|----------------|--------------|-------------------------|-------------------|-----------|---------------------|-----------------|---------|-----|
| Z4 | 2015-12-12 12:12:12 | good | 5% | 2015-12-12 12:12:12 | good | 10.6 | 2015-12-12 12:11:54 | good | 120 | null | null | null | ... |
| Z4 | 2015-12-12 12:17:12 | good | 6% | 2015-12-12 12:17:12 | good | 12.6 | 2015-12-12 12:16:54 | good | 125 | null | null | null | ... |
| brand_value | null |null | null | null | null | null | null | null | null | 2015-12-12 12:17:12 | good | 1000 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
And the other brand tables..
Edit: Queries and quality added
Possible Queries
Note: I assume that the table bmw_table has an extra column that is called car_brand and the table name is simple_table instead of bmw_table to reduce complexity.
SELECT car_brand, SUM(revenue), AVG(`usage`) -- USAGE is a reserved word in MySQL, hence the backticks
FROM simple_table
WHERE timestamp_usage >= '2015-10-01 00:00:00' AND timestamp_usage <= '2015-10-31 23:59:59'
GROUP BY car_brand;
SELECT timestamp_usage, `usage`, revenue, fuel_consumed, kilometer
FROM simple_table
WHERE timestamp_usage >= '2015-10-01 00:00:00' AND timestamp_usage <= '2015-10-31 23:59:59';
Quality Values
I collect the data from an OPC Server, so the quality field contains one of the following values:
bad
badConfigurationError
badNotConnected
badDeviceFailure
badSensorFailure
badLastKnownValue
badCommFailure
badOutOfService
badWaitingForInitialData
uncertain
uncertainLastUsableValue
uncertainSensorNotAccurate
uncertainEUExceeded
uncertainSubNormal
good
goodLocalOverride
Thanks in advance!
Droider
Do not have a separate table per brand. There is no advantage, only unnecessary complexity. Nor one table per model. In general, if two tables look the same, the data should be combined into a single table. In your example, that one table would have brand and model as columns.
Indexes are your friend for performance. Let's see the queries you will perform, so we can discuss the optimal indexes.
What will you do if the data quality is not 'good'? Simply display "good" or "not good"?
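To make the single-table suggestion concrete, here is a rough sketch using Python's built-in sqlite3 in place of MySQL. The trimmed-down column list, the name usage_pct (chosen because USAGE is a reserved word in MySQL), the index, and the sample rows are all assumptions for illustration, not a recommended final schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table for every brand and model; brand and model are plain columns.
    CREATE TABLE simple_table (
        car_brand       TEXT NOT NULL,
        car_model       TEXT NOT NULL,
        timestamp_usage TEXT NOT NULL,
        quality_usage   TEXT NOT NULL,
        usage_pct       REAL
    );
    -- Both example queries filter on the timestamp, so index it.
    CREATE INDEX idx_usage_ts ON simple_table (timestamp_usage);
""")

# Hypothetical sample readings.
rows = [
    ("BMW", "Z4", "2015-10-05 12:12:12", "good", 5.0),
    ("BMW", "Z4", "2015-10-05 12:17:12", "good", 7.0),
    ("BMW", "X5", "2015-11-01 00:00:00", "good", 50.0),  # outside October
    ("Audi", "A4", "2015-10-20 08:00:00", "badSensorFailure", 9.0),
]
conn.executemany("INSERT INTO simple_table VALUES (?, ?, ?, ?, ?)", rows)

# Per-brand aggregate for October, using a half-open time range.
monthly_avg = conn.execute("""
    SELECT car_brand, AVG(usage_pct)
    FROM simple_table
    WHERE timestamp_usage >= '2015-10-01 00:00:00'
      AND timestamp_usage <  '2015-11-01 00:00:00'
    GROUP BY car_brand
    ORDER BY car_brand
""").fetchall()

# Readings whose quality is anything other than 'good'.
not_good = conn.execute("""
    SELECT car_brand, car_model, quality_usage
    FROM simple_table
    WHERE quality_usage <> 'good'
""").fetchall()

print(monthly_avg)
print(not_good)
```

The half-open range (>= start, < next month) avoids having to spell out 23:59:59 and still covers the whole month.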

Split SQL field containing array into new table/rows

I need a list of user IDs (course_user_ids) that is currently stored in a single field of a larger table.
I have a table called courses that contains course information with course_id and course_students as such:
-----------------------------------------------------------
| course_id | course_students |
----------------------------------------------------------
| 1 | a:3:{i:0;i:12345;i:1;i:22345;i:2;i:323456;} |
-----------------------------------------------------------
| 2 | a:32:{ … } |
-----------------------------------------------------------
The course_students part contains 3 chunks of information:
the number of students (a:3:{…) -- not needed
the order/key for the array of each student ({i:0;… i:1;… i:2; …}) -- also not needed
the course_user_id (i:12345; … i:22345;… i:323456;)
I only need the course_user_id and the original course_id, resulting in a new table that I can use for joins/subqueries like this:
------------------------------
| course_id | course_user_id |
------------------------------
| 1 | 12345 |
------------------------------
| 1 | 22345 |
------------------------------
| 1 | 323456 |
------------------------------
(ideally able to continue to break out values for other course_ids and course_user_ids, but not a priority:)
| … | … |
------------------------------
| 2 | … |
------------------------------
| 2 | … |
------------------------------
| 97 | … |
------------------------------
| 97 | … |
------------------------------
| … | … |
------------------------------
Note: the course_user_id can vary in length (some are 5 digits, some are 6)
Any ideas would be much appreciated!
Update
My user table does have user_id which can be mapped to course_students or course_user_id, so that is a very helpful observation from below.
I also think I need to use a LEFT JOIN because some students are registered in multiple courses, and I'd like to see each instance/combo.
Let us assume that you have a table named users, which contains all user data along with user_id.
Now you can join the courses and users tables in the following manner:
select c.course_id,u.user_id
from
courses c
join users u
on u.user_id=if(instr(c.course_students,concat(":",u.user_id,";"))>0,u.user_id,c.course_students)
You get the result as per your requirement.
Verify at http://sqlfiddle.com/#!9/3667d/2
Note: the query above works as long as no user_id coincides with an array index (a user with id 1, for example, would also match the key in i:1;). If such an overlap is possible, filter those rows out with a WHERE clause.
If I understand your goal correctly, you have a users table, and the values in {i:0;i:12345;i:1;i:22345;i:2;i:323456;} correspond to users.id = 12345, users.id = 22345, and so on.
If my guess is correct you can try this solution:
http://sqlfiddle.com/#!9/cfef27/5
SELECT * FROM courses
LEFT JOIN users u
ON courses.course_students LIKE CONCAT('%i:',u.id,';%')
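If doing the split outside the database is an option, the string can also be parsed directly, which sidesteps the index/id overlap problem. The following is only a sketch in Python: it assumes the field always has the shape shown in the question (integer keys followed by integer values, which looks like PHP serialize() output), and the function name extract_user_ids is made up for this example.

```python
import re
from typing import List

def extract_user_ids(serialized: str) -> List[int]:
    """Return only the values from a string like a:3:{i:0;i:12345;...}."""
    # Each element looks like "i:<key>;i:<value>;"; capture only the value.
    return [int(value) for value in re.findall(r"i:\d+;i:(\d+);", serialized)]

# Sample data from the question: course_id -> course_students.
courses = {1: "a:3:{i:0;i:12345;i:1;i:22345;i:2;i:323456;}"}

# Flatten into (course_id, course_user_id) rows, ready to insert into a new table.
course_users = [
    (course_id, user_id)
    for course_id, students in courses.items()
    for user_id in extract_user_ids(students)
]
print(course_users)
```

Because keys and values are matched in pairs, a user id that happens to equal an array index is not confused with the index.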

Join multiple tables with same column name

I have these tables in my MySQL database:
General table:
+----generalTable-----+
+---------------------+
| id | scenario | ... |
+----+----------+-----+
| 1 | facebook | ... |
| 2 | chief | ... |
| 3 | facebook | ... |
| 4 | chief | ... |
Facebook Table:
+----facebookTable-----+
+----------------------+
| id | expiresAt | ... |
+----+-----------+-----+
| 1 | 12345678 | ... |
| 3 | 45832458 | ... |
Chief Table:
+------chiefTable------+
+----------------------+
| id | expiresAt | ... |
+----+-----------+-----+
| 2 | 43547343 | ... |
| 4 | 23443355 | ... |
Basically, the general table holds some (obviously) general data. Based on generalTable.scenario, you can look up more details in the other two tables, which have some columns in common (expiresAt, for example) but differ in others.
My question is, how to get the joined data of generalTable and the right detailed table in just one query.
So, I would like a query like this:
SELECT id, scenario, expiresAt
FROM generalTable
JOIN facebookTable
ON generalTable.id = facebookTable.id
JOIN chiefTable
ON generalTable.id = chiefTable.id
And an output like this:
| id | scenario | expiresAt |
+----+----------+-----------+
| 1 | facebook | 12345678 |
| 2 | chief | 43547343 |
| 3 | facebook | 45832458 |
| 4 | chief | 23443355 |
However, this doesn't work, because both facebookTable and chiefTable have ambiguous column name "expiresAt". For the ease of use I want to keep it that way. The result table should also only have one column "expiresAt" that is automatically filled with the right values from either facebookTable or chiefTable.
You might want to consider adding expiresAt to your general table, and removing it from the others, to remove duplication in the schema, and to make this particular query simpler.
If you need to stick with your current schema, you can use table aliases to resolve the name ambiguity, and use two joins and a union to create the result you are looking for:
SELECT g.id, g.scenario, f.expiresAt
FROM generalTable g
JOIN facebookTable f
ON g.id = f.id
UNION ALL
SELECT g.id, g.scenario, c.expiresAt
FROM generalTable g
JOIN chiefTable c
ON g.id = c.id;
The outer join approach mentioned in another answer would also solve the problem.
One way to accomplish this is with LEFT JOINs. For the common fields, the select list can then pick whichever side matched, for example: IF(fTbl.id IS NULL, cTbl.expiresAt, fTbl.expiresAt) AS expiresAt.
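A minimal sketch of the LEFT JOIN variant, run against the sample data from the question. It uses Python's built-in sqlite3 as a stand-in for MySQL, and COALESCE in place of the IF(...) expression (COALESCE is available in both); the column types are assumed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE generalTable  (id INTEGER PRIMARY KEY, scenario TEXT);
    CREATE TABLE facebookTable (id INTEGER PRIMARY KEY, expiresAt INTEGER);
    CREATE TABLE chiefTable    (id INTEGER PRIMARY KEY, expiresAt INTEGER);
    INSERT INTO generalTable  VALUES (1, 'facebook'), (2, 'chief'),
                                     (3, 'facebook'), (4, 'chief');
    INSERT INTO facebookTable VALUES (1, 12345678), (3, 45832458);
    INSERT INTO chiefTable    VALUES (2, 43547343), (4, 23443355);
""")

# Each general row matches at most one detail table; the other side is NULL,
# so COALESCE returns the expiresAt of whichever table matched.
joined = conn.execute("""
    SELECT g.id, g.scenario, COALESCE(f.expiresAt, c.expiresAt) AS expiresAt
    FROM generalTable g
    LEFT JOIN facebookTable f ON g.id = f.id
    LEFT JOIN chiefTable    c ON g.id = c.id
    ORDER BY g.id
""").fetchall()

for row in joined:
    print(row)
```

Qualifying every column with its table alias is what removes the "ambiguous column" error from the original query.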

Structuring a MySQL database for user information

I am quite new to MySQL. I know most of the basic functions, how to send queries, etc. However, I am trying to learn how to structure a database for optimal searches for user information and wanted to get some ideas.
Right now I just have one table (for functionality purposes and testing) called user_info, which holds the users' information, and another table that stores photos linked to the user. Ideally I'd like most of this information to be accessible as quickly as possible.
When creating a database that is primarily used to store and retrieve user information (name, age, phone, messages, etc.), would it be a good idea to create a NEW TABLE for each new user that stores all of that user's information, so that the one table user_info does not become bogged down by multiple queries, locking, etc.? For example, user john smith would have his very own table in the database holding all his information, including photos, messages, etc.
OR
is it better to have just a few tables, such as user_info, user_photos, user_messages, etc., and access the data in this manner?
I am not concerned about redundancy in the tables such as the users email address being repeated multiple times.
The latter is the best way. You declare one table for users, and several columns with the data you want.
Now if you want users to have photos, you'd require a new table with photos and a Foreign Key attribute that links to the user table's Primary Key.
You should definitely NOT create a new table for each user. Create one table for user_info, one for photos if each user can have many photos. A messages table would probably contain two user_id columns (user_to, user_from) and a message column. Try to normalize the data as much as possible.
Users
====
id
email
etc
Photos
====
id
user_id
meta_data
etc
Messages
====
id
user_id_to
user_id_from
message
timestamp
etc
I agree with both the answers supplied here, but one thing they haven't mentioned yet is lookup tables.
Going with the general examples here, consider this: you have a users table and a photos table. Now you want to introduce a feature on your site that allows users to "Favorite" photos from other users.
Rather than making a new table called "Favorites" and adding in all your data about the image (file location, metadata, score/whatever) all over again, have a table that effectively sits BETWEEN the other two.
+-----------------------+ +-------------------------------------+
| ++ users | | ++ photos |
| userID | email | name | | photoID | ownerID | fileLo | etc... |
+--------+-------+------| +---------+---------+--------+--------+
| 1 | .... | Tom | | 35 | 1 | ..... | .......|
| 2 | .... | Rob | | 36 | 2 | ..... | .......|
| 3 | .... | Dan | | 37 | 1 | ..... | .......|
+--------+-------+------+ | 43 | 3 | ..... | .......|
| 48 | 2 | ..... | .......|
| 49 | 3 | ..... | .......|
| 53 | 2 | ..... | .......|
+---------+---------+--------+--------+
+------------------+
| ++ Favs |
| userID | photoID |
+--------+---------+
| 1 | 37 |
| 1 | 48 |
| 2 | 37 |
+--------+---------+
With this approach, you link the data you have cleanly, efficiently and without too much data replication.
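Here is a small sketch of how such an in-between table is queried, using the sample rows from the diagram. It runs on Python's built-in sqlite3 rather than MySQL, leaves out the elided columns, and the composite primary key on Favs is my own addition to stop the same favorite being stored twice.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (userID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photos (photoID INTEGER PRIMARY KEY,
                         ownerID INTEGER REFERENCES users(userID));
    -- The table "between" users and photos: only the two keys, no photo data.
    CREATE TABLE favs (
        userID  INTEGER REFERENCES users(userID),
        photoID INTEGER REFERENCES photos(photoID),
        PRIMARY KEY (userID, photoID)
    );
    INSERT INTO users  VALUES (1, 'Tom'), (2, 'Rob'), (3, 'Dan');
    INSERT INTO photos VALUES (35, 1), (36, 2), (37, 1), (43, 3),
                              (48, 2), (49, 3), (53, 2);
    INSERT INTO favs   VALUES (1, 37), (1, 48), (2, 37);
""")

# All photos favorited by user 1 (Tom), with the name of each photo's owner.
toms_favs = conn.execute("""
    SELECT p.photoID, owner.name
    FROM favs f
    JOIN photos p     ON p.photoID = f.photoID
    JOIN users  owner ON owner.userID = p.ownerID
    WHERE f.userID = ?
    ORDER BY p.photoID
""", (1,)).fetchall()
print(toms_favs)
```

The photo details live only in photos; Favs just records which user points at which photo.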

MySQL Multi Duplicate Record Merging

A previous DBA managed a non relational table with 2.4M entries, all with unique ID's. However, there are duplicate records with different data in each record for example:
+----+---------+--------------+-------+------------+-------------+
| id | Name    | Address      | Phone | Email      | LastVisited |
+----+---------+--------------+-------+------------+-------------+
| 1  | bob     | 12 Some Road | 02456 |            |             |
| 2  | bobby   |              | 02456 | bob#domain |             |
| 3  | bob     | 12 Some Rd   | 02456 |            | 2010-07-13  |
| 4  | sir bob |              | 02456 |            |             |
| 5  | bob     | 12SomeRoad   | 02456 |            |             |
| 6  | mr bob  |              | 02456 |            |             |
| 7  | robert  |              | 02456 |            |             |
+----+---------+--------------+-------+------------+-------------+
This isn't the exact table (the real table has 32 columns); it is just to illustrate.
I know how to identify the duplicates; in this case I'm using the phone number. I've extracted the duplicates into a separate table, and there are 730k entries in total.
What would be the most efficient way of merging these records (and flagging the un-needed records for deletion)?
I've looked at using UPDATE with INNER JOINs, but several WHERE clauses are needed, because I want to update the first record with data from subsequent records, where a subsequent record has additional data that the former record does not.
I've looked at third-party software such as Fuzzy Dups, but I'd like a pure MySQL option if possible.
The end goal is that I'd be left with something like:
+----+---------+--------------+-------+------------+-------------+
| id | Name    | Address      | Phone | Email      | LastVisited |
+----+---------+--------------+-------+------------+-------------+
| 1  | bob     | 12 Some Road | 02456 | bob#domain | 2010-07-13  |
+----+---------+--------------+-------+------------+-------------+
Should I be looking at looping in a stored procedure / function, or is there some really easy thing I've missed?
You have to create a PROCEDURE, but before that,
create your own temp_table like this:
INSERT INTO temp_table (column1, column2, ...) SELECT column1, column2, ... FROM myTable GROUP BY phoneNumber
You have to create this as a physical table so that you can run a cursor on it.
create PROCEDURE myPROC
{
create a cursor on temp_table:
fetch the phoneNumber and id of the current row from temp_table into local variables (L_id, L_phoneNumber).
Here, too, you need to create a new similar_tempTable, which will contain the values from:
INSERT INTO similar_tempTable (column1, column2, ...) SELECT column1, column2, ... FROM myTable WHERE phoneNumber = L_phoneNumber
The next step is to extract the values of each column you want from similar_tempTable, update them into the row of myTable where id = L_id, and delete the remaining duplicate rows from myTable.
Finally, truncate similar_tempTable after every iteration of the cursor.
}
Hope this helps.
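As an alternative to the cursor loop, the same merge rule (keep the lowest id per phone number and fill each column with the first non-empty value among the duplicates) can be expressed as a single set-based query. The sketch below is only an illustration: it swaps the procedure for correlated subqueries, runs on Python's built-in sqlite3 instead of MySQL, assumes the empty cells are stored as NULL, and uses a made-up table name (contacts) with the first four sample rows from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, address TEXT,
                           phone TEXT, email TEXT, last_visited TEXT);
    INSERT INTO contacts VALUES
        (1, 'bob',     '12 Some Road', '02456', NULL,         NULL),
        (2, 'bobby',   NULL,           '02456', 'bob#domain', NULL),
        (3, 'bob',     '12 Some Rd',   '02456', NULL,         '2010-07-13'),
        (4, 'sir bob', NULL,           '02456', NULL,         NULL);
""")

# For each column, take the first non-NULL value in id order within the group.
columns = ["name", "address", "email", "last_visited"]
first_non_null = ",\n       ".join(
    f"(SELECT {col} FROM contacts WHERE phone = k.phone AND {col} IS NOT NULL "
    f"ORDER BY id LIMIT 1) AS {col}"
    for col in columns
)
merge_sql = f"""
    SELECT k.id, k.phone,
           {first_non_null}
    FROM (SELECT phone, MIN(id) AS id FROM contacts GROUP BY phone) AS k
    ORDER BY k.id
"""
merged = conn.execute(merge_sql).fetchall()

# Every row that is not the lowest id of its phone group is a deletion candidate.
to_delete = [row[0] for row in conn.execute("""
    SELECT id FROM contacts
    WHERE id NOT IN (SELECT MIN(id) FROM contacts GROUP BY phone)
    ORDER BY id
""")]

print(merged)
print(to_delete)
```

On the real 32-column table, the columns list would name every column to be merged, and an index on the phone column would matter for the correlated subqueries.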