MySQL selecting from two tables for comparison - mysql

I've searched the other related threads, but I don't think I'm looking for a UNION or an OUTER JOIN. What I'm trying to do is pretty simple in theory. I have two tables in two different databases, both with roughly the same data. I'm trying to present them together so that we can compare them. The field names are different, but the data is very similar.
Imagine something like this:
table 'foo':
id first_name last_name dept_name
+---+----------+---------------+-------------+
| 1|Bob |Boberson | Accounting |
| 2|Steven |McStevens | Sales |
| 3|Jane |Janeston | Support |
+---+----------+---------------+-------------+
table 'bar':
person_id first last department_id
+----------+----------+---------------+--------------+
| 1|Bob |Boberson | 2|
| 2|Doug |Dugger | 5|
| 3|Jane |Janeston | 3|
+----------+----------+---------------+--------------+
and I'm trying to end up with something like this:
person_id first last department
+----------+----------+---------------+--------------+
| foo_1|Bob |Boberson | Accounting |
| foo_2|Steven |McStevens | Sales |
| foo_3|Jane |Janeston | Support |
| bar_1|Bob |Boberson | Accounting |
| bar_2|Doug |Dugger | IT |
| bar_3|Jane |Janeston | Support |
+----------+----------+---------------+--------------+
It's easy enough to get the two tables to resemble each other with two separate selects using 'as' to change the column names, concat's for various fields, and doing the appropriate join to fill in the 'department' fields. But, I can't do a 'join' and keep that logic in place. I really need to do a select statement for each table. There's probably a simple solution here, but I'm not seeing it.
EDIT: You guys are correct, this is a pretty standard case for a UNION. I was thinking that UNIONS always add columns for some reason. Thanks.

I think you can use UNION to get the data from both tables in one query. Note that plain UNION removes duplicate rows; use UNION ALL if you don't want anything trimmed. You can also select the table name as a literal column, or use it to prefix the person_id, if you want.
This needs testing/improving but:
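(SELECT 'foo' AS source_table, f.id AS person_id, f.first_name AS first, f.last_name AS last, f.dept_name AS department FROM foo AS f)
UNION
-- department_id still needs a join against the department lookup table to show the name
(SELECT 'bar' AS source_table, b.person_id, b.first, b.last, b.department_id FROM bar AS b)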

This looks exactly like a UNION
Have a look at the SQL Fiddle. I didn't bother doing a lookup for dept ID, but this should give you the basic idea.
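For anyone who wants to try the UNION end to end, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, so it uses `||` where MySQL would use CONCAT(). The `departments` lookup table and its column names are assumptions, since the question never names that table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (id INTEGER, first_name TEXT, last_name TEXT, dept_name TEXT);
CREATE TABLE bar (person_id INTEGER, first TEXT, last TEXT, department_id INTEGER);
-- Hypothetical lookup table: the question does not say what it is called.
CREATE TABLE departments (department_id INTEGER, dept_name TEXT);

INSERT INTO foo VALUES (1, 'Bob', 'Boberson', 'Accounting'),
                       (2, 'Steven', 'McStevens', 'Sales'),
                       (3, 'Jane', 'Janeston', 'Support');
INSERT INTO bar VALUES (1, 'Bob', 'Boberson', 2),
                       (2, 'Doug', 'Dugger', 5),
                       (3, 'Jane', 'Janeston', 3);
INSERT INTO departments VALUES (2, 'Accounting'), (3, 'Support'), (5, 'IT');
""")

# One SELECT per table, each shaped into the same four columns.
# In MySQL the prefix would be written CONCAT('foo_', f.id).
COMPARE_SQL = """
SELECT 'foo_' || f.id AS person_id, f.first_name AS first,
       f.last_name AS last, f.dept_name AS department
FROM foo AS f
UNION ALL
SELECT 'bar_' || b.person_id, b.first, b.last, d.dept_name
FROM bar AS b
JOIN departments AS d ON d.department_id = b.department_id
"""

rows = conn.execute(COMPARE_SQL).fetchall()
for row in rows:
    print(row)
```

UNION ALL is used here because the prefixed ids already make every row distinct, so there is nothing for a plain UNION to deduplicate.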

MySQL subquery or join different databases where database name is in the main query's result set

For example, I need to list all customers and count their users, as in this simplified, self-explanatory SQL:
SELECT customer_name,
(SELECT COUNT(*) FROM <main_db.customers.database_name>.users) AS user_count
FROM main_db.customers
Let's say this is main_db structure:
+--------------------------------+
| customer_name | database_name |
|--------------------------------|
| Customer One | customer_db1 |
| Customer Two | customer_db2 |
| Customer Three | customer_db3 |
| etc... |
+--------------------------------+
And this is customer_dbX structure:
+------------+
| users |
|------------|
| User One |
| User Two |
| User Three |
| etc... |
+------------+
I want to receive this result set:
+-----------------------------+
| customer_name | user_count |
|-----------------------------|
| Customer One | 12 |
| Customer Two | 59 |
| Customer Three | 34 |
| etc... |
+-----------------------------+
Is this possible with a subquery, join or ANY syntax?
This is too long for a comment.
No, you cannot do what you want as a single query. You can use dynamic SQL, making use of the information_schema.schemata table.
But, perhaps there are other solutions. First, I would recommend that you not use separate databases for different customers, unless you absolutely have to. Here are a few reasons why you might have to:
You are contractually obligated to use separate databases (perhaps because lawyers don't fully understand database security).
The databases have different backup/restore requirements.
Different customers will have customer-specific customizations that are most easily handled as different databases.
Under most circumstances, storing data for multiple customers in a single database is the right way to go. It certainly simplifies managing the system, upgrading to new versions, identifying and fixing bugs, backing up the database, replicating the system in case of failure, and so on and so on.
If, though, you have to have separate databases, then consider creating a view in the master database that combines all the tables together:
create view v_master_users as
select 'customer_db1' as which, u.* from customer_db1.users u union all
select 'customer_db2' as which, u.* from customer_db2.users u union all
. . .;
Then, use this view for your querying.
If adding a customer requires creating a database, then you'll have ample opportunity to update the view to handle new customers.
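A rough sketch of the dynamic SQL route, for comparison: read the database names from main_db.customers, then build one UNION ALL query from them. This runs on SQLite through Python's sqlite3 module, with attached in-memory databases standing in for separate MySQL databases. The helper name `user_counts`, the `name` column of `users`, and the sample users are assumptions made for illustration.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Attached in-memory databases play the role of separate MySQL databases.
for schema in ("main_db", "customer_db1", "customer_db2"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")

conn.execute("CREATE TABLE main_db.customers (customer_name TEXT, database_name TEXT)")
conn.executemany(
    "INSERT INTO main_db.customers VALUES (?, ?)",
    [("Customer One", "customer_db1"), ("Customer Two", "customer_db2")],
)
# Illustrative sample data only.
sample_users = {
    "customer_db1": ["User One", "User Two", "User Three"],
    "customer_db2": ["User One", "User Two"],
}
for schema, users in sample_users.items():
    conn.execute(f"CREATE TABLE {schema}.users (name TEXT)")
    conn.executemany(f"INSERT INTO {schema}.users VALUES (?)", [(u,) for u in users])


def user_counts(conn):
    """Build and run one UNION ALL query with a COUNT(*) per customer database."""
    customers = conn.execute(
        "SELECT customer_name, database_name FROM main_db.customers"
    ).fetchall()
    parts, params = [], []
    for customer_name, database_name in customers:
        # Identifiers cannot be bound as parameters, so validate before interpolating.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database_name):
            raise ValueError(f"unexpected database name: {database_name!r}")
        parts.append(
            f"SELECT ? AS customer_name, "
            f"(SELECT COUNT(*) FROM {database_name}.users) AS user_count"
        )
        params.append(customer_name)
    if not parts:
        return []
    return conn.execute(" UNION ALL ".join(parts), params).fetchall()


counts = dict(user_counts(conn))
print(counts)
```

The database name has to be spliced into the SQL text, which is why it is checked against a strict pattern first; the customer name is an ordinary value and goes through a bound parameter.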
If I correctly understood your problem, I would use UNION:
SELECT CUSTOMER_NAME FROM MAIN_DB.CUSTOMERS ORDER BY ID ASC
UNION
SELECT COUNT(*) FROM MAIN_DB.CUSTOMERS.USERS GROUP BY CUSTOMERS ORDER BY ID ASC

Dynamic value to display numbers of entries in second table

I've got multiple entries in table A and would like to display the number of entries in a column of table B. Is there a way to create dynamic cell content displaying the number of entries in a table?
I'm a beginner in MySQL and did not find a way to do it so far.
Example table A:
+----+------+------------+
| id | name | birthday |
+----+------+------------+
| 1 | john | 1976-11-18 |
| 2 | bill | 1983-12-21 |
| 3 | abby | 1991-03-11 |
| 4 | lynn | 1969-08-02 |
| 5 | jake | 1989-07-29 |
+----+------+------------+
What I'd like in table B:
+----+------+----------+
| id | name | numusers |
| 1 | tblA | 5 |
+----+------+----------+
In my actual database there is no incrementing ID so just taking the last value would not work - if this would've been a solution.
If MySQL can't handle this the option would be to create some kind of cronjob on my server reading the number of rows and writing them into that cell. I know how to do this - just checking if there's another way.
I'm not looking for a command to run on the mysql-console. What I'm trying to figure out is if there's some option which dynamically changes the cell's value to what I've described above.
You can create a view that will give you this information. The SQL for this view is inspired by an answer to a similar question:
CREATE VIEW table_counts AS
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = '{your_db}';
The view will have the cells you speak of. As you can see, it is just a filter on an already existing table, so you might consider that this table information_schema.tables is the answer to your question.
You can do that directly with COUNT(), for example SELECT COUNT(*) FROM TblA. That gives you the number of rows in that table. If your indexes are OK, it's very fast. If you write the count to another table, you still have to make a request to read it from that second table, so I think you can just do it directly.
If you have some performance problems there are some other possibilities like Triggers or Stored Procedures to calculate that result and save them in a memory table to get a better performance.
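The two suggestions above can be combined: a view whose only row is computed with COUNT(*) behaves like the "dynamic cell" the question asks for, and it returns an exact count (table_rows in information_schema.tables is only an estimate for InnoDB tables). A minimal sketch on SQLite via Python's sqlite3 module; the view name `tblB`, the helper `numusers`, and the extra inserted row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblA (id INTEGER, name TEXT, birthday TEXT);
INSERT INTO tblA VALUES (1, 'john', '1976-11-18'),
                        (2, 'bill', '1983-12-21'),
                        (3, 'abby', '1991-03-11'),
                        (4, 'lynn', '1969-08-02'),
                        (5, 'jake', '1989-07-29');

-- The "table B" from the question, expressed as a view so it never goes stale.
CREATE VIEW tblB AS
SELECT 1 AS id, 'tblA' AS name, COUNT(*) AS numusers FROM tblA;
""")


def numusers(conn):
    """Read the live row count through the view."""
    return conn.execute("SELECT numusers FROM tblB WHERE name = 'tblA'").fetchone()[0]


before = numusers(conn)
# Hypothetical extra row, only to show that the view follows the table.
conn.execute("INSERT INTO tblA VALUES (6, 'mary', '2000-01-01')")
after = numusers(conn)
print(before, after)
```

The count is recomputed on every read, so nothing needs a cron job or a trigger; the trade-off is that each read costs a COUNT(*) on the underlying table.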

MySQL Table structure: Multiple attributes for each item

I wanted to ask you which could be the best approach creating my MySQL database structure having the following case.
I've got a table with items, which is not needed to describe as the only important field here is the ID.
Now, I'd like to be able to assign some attributes to each item - by its ID, of course. But I don't know exactly how to do it, as I'd like to keep it dynamic (so, I do not have to modify the table structure if I want to add a new attribute type).
What I think
I think (and this is in fact the structure I have right now) that I can make a table items_attributes with the following structure:
+----+---------+----------------+-----------------+
| id | item_id | attribute_name | attribute_value |
+----+---------+----------------+-----------------+
| 1 | 1 | place | Barcelona |
| 2 | 2 | author_name | Matt |
| 3 | 1 | author_name | Kate |
| 4 | 1 | pages | 200 |
| 5 | 1 | author_name | John |
+----+---------+----------------+-----------------+
I put data as an example for you to see that those attributes can be repeated (it's not a relation 1 to 1).
The problem with this approach
I need to run some queries, some of them for statistical purposes, and if I have a lot of attributes for a lot of items, this can be a bit slow.
Furthermore, maybe because I'm not an expert on MySQL, every time I want to search for "those items that have 'place' = 'Barcelona' AND 'author_name' = 'John'", I end up having to make a separate JOIN for every condition.
Repeating the example before, my query would end up like:
SELECT *
FROM items its
JOIN items_attributes attr
ON its.id = attr.item_id
AND attr.attribute_name = 'place'
AND attr.attribute_value = 'Barcelona'
AND attr.attribute_name = 'author_name'
AND attr.attribute_value = 'John';
As you can see, this will return nothing, as an attribute_name cannot have two values at once in the same row, and an OR condition would not be what I'm searching for as the items MUST have both attributes values as stated.
So the only possibility is to JOIN the same table again for every search condition, which I think is very slow when there are a lot of terms to search for.
What I'd like
As I said, I'd like to keep the attribute types dynamic, so that adding a new value in 'attribute_name' is enough, without having to add a new column to a table. Also, as this is a 1-N relationship, the attributes cannot be put in the 'items' table as new columns.
If this structure is, in your opinion, the only one that can achieve what I want, it would also be great if you could suggest some ideas so that the search queries are not a ton of JOINs.
I don't know how hard this is to get right; I've been racking my brain and haven't come up with a solution. Hope you guys can help me with that!
In any case, thank you for your time and attention!
Kind regards.
You're thinking in the right direction, the direction of normalization. The normal form you would like to have in your database is the fifth normal form (or even the sixth). See Stack Overflow on this matter.
Table Attribute:
+----+----------------+
| id | attribute_name |
+----+----------------+
| 1 | place |
| 2 | author name |
| 3 | pages |
+----+----------------+
Table ItemAttribute
+--------+----------------+
| item_id| attribute_id |
+--------+----------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
+--------+----------------+
So for each property of an object (item in this case) you create a new table and name it accordingly. It requires lots of joins, but your database will be highly flexible and organized. Good luck!
In my opinion it should be something like this. I know there are a lot of tables, but it actually normalizes your DB.
That said, I can't understand where you get your att_value column from, or what this column should contain.
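On the question of avoiding one JOIN per condition: an alternative that neither answer spells out is to match rows with OR and then keep only the items that satisfied every condition, using GROUP BY and HAVING. The sketch below keeps the question's original items_attributes layout and runs on SQLite via Python's sqlite3 module. The helper name `find_items` and item 3 are assumptions added for illustration, and the helper expects at least one condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items_attributes (
    id INTEGER PRIMARY KEY,
    item_id INTEGER,
    attribute_name TEXT,
    attribute_value TEXT
);
INSERT INTO items_attributes VALUES
    (1, 1, 'place', 'Barcelona'),
    (2, 2, 'author_name', 'Matt'),
    (3, 1, 'author_name', 'Kate'),
    (4, 1, 'pages', '200'),
    (5, 1, 'author_name', 'John'),
    -- Hypothetical item 3: matches only one of the two conditions.
    (6, 3, 'place', 'Barcelona');
""")


def find_items(conn, wanted):
    """Return ids of items that have every name/value pair in `wanted` (non-empty dict)."""
    clause = " OR ".join("(attribute_name = ? AND attribute_value = ?)" for _ in wanted)
    params = [part for pair in wanted.items() for part in pair]
    sql = (
        "SELECT item_id FROM items_attributes "
        f"WHERE {clause} "
        "GROUP BY item_id "
        # An item qualifies only if all requested attribute names were matched.
        "HAVING COUNT(DISTINCT attribute_name) = ? "
        "ORDER BY item_id"
    )
    return [row[0] for row in conn.execute(sql, params + [len(wanted)])]


matches = find_items(conn, {"place": "Barcelona", "author_name": "John"})
print(matches)
```

The query stays the same shape no matter how many conditions are added; only the OR list and the number in the HAVING clause grow.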

mysql select from 2 other columns in the same table

I have a table which looks like this but much longer...
| CategoryID | Category | ParentCategoryID |
+------------+----------+------------------+
| 23 | Screws | 3 |
| 3 | Packs | 0 |
I am aiming to retrieve one column from this which in this instance would give me the following...
| Category |
+--------------+
| Packs/Screws |
Please excuse me for not knowing exactly how to word this, so far I can only think to split the whole table into multiple tables and use LEFT JOIN, this seems like a very good opportunity for a learning curve however.
I realise that CONCAT() will come into play when combining the two retrieved Category names but beyond that I am stumped.
SELECT CONCAT(y.category,'/',x.category) Category
FROM my_table x
JOIN my_table y
ON y.categoryid = x.parentcategoryid
[WHERE y.parentcategoryid = 0]
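A small runnable sketch of the self-join, using SQLite through Python's sqlite3 module (so `||` stands in for MySQL's CONCAT()). The table name my_table comes from the answer; here `child` and `parent` are just aliases for the same table, with the parent's name placed first to produce "Packs/Screws".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (CategoryID INTEGER, Category TEXT, ParentCategoryID INTEGER);
INSERT INTO my_table VALUES (23, 'Screws', 3), (3, 'Packs', 0);
""")

# Join the table to itself: each child row is paired with the row of its parent.
PATH_SQL = """
SELECT parent.Category || '/' || child.Category AS Category
FROM my_table AS child
JOIN my_table AS parent ON parent.CategoryID = child.ParentCategoryID
"""

paths = [row[0] for row in conn.execute(PATH_SQL)]
print(paths)
```

Top-level rows such as Packs have no parent row, so the inner join leaves them out; a LEFT JOIN would keep them if they are wanted in the result.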

Is this good Database Normalization?

I am a beginner at using mysql and I am trying to learn the best practices. I have setup a similar structure as seen below.
(main table that contains all unique entries) TABLE = 'main_content'
+------------+---------------+------------------------------+-----------+
| content_id | (deleted) | title | member_id |
+------------+---------------+------------------------------+-----------+
| 6 | | This is a very spe?cal t|_st | 1 |
+------------+---------------+------------------------------+-----------+
(Provides the total of each difficulty and joins id --> actual name) TABLE = 'difficulty'
+---------------+-------------------+------------------+
| difficulty_id | difficulty_name | difficulty_total |
+---------------+-------------------+------------------+
| 1 | Absolute Beginner | 1 |
| 2 | Beginner | 1 |
| 3 | Intermediate | 0 |
| 4 | Advanced | 0 |
| 5 | Expert | 0 |
+---------------+-------------------+------------------+
(This table ensures that multiple values can be inserted for each entry. For example,
this specific entry indicates that there are 2 difficulties associated with the submission)
TABLE = 'lookup_difficulty'
+------------+---------------+
| content_id | difficulty_id |
+------------+---------------+
| 6 | 1 |
| 6 | 2 |
+------------+---------------+
I am joining all of this into a readable query:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN difficulty ON difficulty.difficulty_id
IN (SELECT difficulty_id FROM main_content, lookup_difficulty WHERE lookup_difficulty.content_id = main_content.content_id )
INNER JOIN member ON member.member_id = main_content.member_id
The above works fine, but I am wondering if this is good practice. I practically followed the structure laid out in Wikipedia's Database Normalization example.
When I run the above query using EXPLAIN, it says 'Using where; Using join buffer' and also that I am using 2 DEPENDENT SUBQUERYs. I don't see any way to NOT use subqueries to achieve the same effect, but then again I'm a noob, so perhaps there is a better way...
The DB design looks fine - regarding your query, you could rewrite it exclusively with joins like:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN lookup_difficulty ON main_content.content_id = lookup_difficulty.content_id
INNER JOIN difficulty ON difficulty.difficulty_id = lookup_difficulty.difficulty_id
INNER JOIN member ON member.member_id = main_content.member_id
GROUP BY main_content.content_id, member.member_name
If lookup_difficulty provides the link between content and difficulty, I would suggest you take the difficulty_id column out of your main_content table. Since you can have multiple lookups for each content_id, you would need extra business logic to decide which difficulty_id to put in main_content (for example the biggest value, the smallest value, or a random one), or multiple entries in main_content for each difficulty_id, which goes against normalization practices. In either case, it does not make much sense.
Other than that the table looks fine.
Update
Saw you updated the table :)
Just as a side-note. Using IN can slow down your query (IN can cause a table-scan). In any case, it used to be that way, but I'm sure that these days the SQL compiler optimizes it pretty well.
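To see the join-only version run against the sample rows from the question, here is a minimal sketch using SQLite through Python's sqlite3 module. The `member` table's contents, the member name 'alice', and the title text are placeholders, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_content (content_id INTEGER, title TEXT, member_id INTEGER);
CREATE TABLE difficulty (difficulty_id INTEGER, difficulty_name TEXT, difficulty_total INTEGER);
CREATE TABLE lookup_difficulty (content_id INTEGER, difficulty_id INTEGER);
-- Placeholder member table: only its two joined columns appear in the question.
CREATE TABLE member (member_id INTEGER, member_name TEXT);

INSERT INTO main_content VALUES (6, 'example title', 1);
INSERT INTO difficulty VALUES (1, 'Absolute Beginner', 1), (2, 'Beginner', 1),
                              (3, 'Intermediate', 0), (4, 'Advanced', 0), (5, 'Expert', 0);
INSERT INTO lookup_difficulty VALUES (6, 1), (6, 2);
INSERT INTO member VALUES (1, 'alice');
""")

# Joins only, no subquery; one output row per content entry.
JOIN_SQL = """
SELECT group_concat(d.difficulty_name) AS difficulty, m.member_name
FROM main_content AS c
JOIN lookup_difficulty AS l ON l.content_id = c.content_id
JOIN difficulty AS d ON d.difficulty_id = l.difficulty_id
JOIN member AS m ON m.member_id = c.member_id
GROUP BY c.content_id, m.member_name
"""

result = conn.execute(JOIN_SQL).fetchall()
print(result)
```

The GROUP BY is what keeps the difficulties of different content entries from being concatenated into a single row once the table holds more than one entry.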