Combine rows from two tables with different columns? - mysql

I'm having a hard time wrapping my head around this one. I believe it's happening because I am joining the two separate tables based on the same column (user_id), but I don't know how to fix it because the only thing in common between the two tables IS the user_id column.
Here is the query.
SELECT users_data_existing.`date`,`message`,`action`,`status`,`data`,
users_data_new.`date`,`data_new`
FROM users_data_existing
INNER JOIN users_data_action USING (action_id)
INNER JOIN users_data_status_user USING (status_user_id)
INNER JOIN `users` USING (user_id)
INNER JOIN users_data_new USING (user_id)
INNER JOIN data ON users_data_existing.`data_id` = data.`id`
WHERE users_data_existing.`user_id` = 2
ORDER BY users_data_existing.`date`,users_data_new.`date` DESC
The result is that the users_data_new.date and data_new columns are "appended" to every row from users_data_existing, so each existing row is repeated once per row in users_data_new.
+----------+-----------+-----------+-----------+-----------+----------+-----------+
| date | message | action | status | data | date | data_new |
+----------+-----------+-----------+-----------+-----------+----------+-----------+
|2011-01-01| data | data | data | data |2011-01-02| data_new |
-----------------------------------------------------------------------------------
|2011-01-01| data | data | data | data |2011-01-03| data_new1 |
-----------------------------------------------------------------------------------
REPEATS PATTERN FOR TOTAL RECORDS IN users_data_new TABLE
+----------+-----------+-----------+-----------+-----------+----------+-----------+
| date | message | action | status | data | date | data_new |
+----------+-----------+-----------+-----------+-----------+----------+-----------+
|2011-01-01| data1 | data1 | data1 | data1 |2011-01-02| data_new |
-----------------------------------------------------------------------------------
|2011-01-01| data1 | data1 | data1 | data1 |2011-01-03| data_new1 |
-----------------------------------------------------------------------------------
But that's not what I need. How can I get the last two columns into a separate row? I think a UNION would resolve this but I can't do that because the tables are almost identical but don't share the message column.

As suspected in the question, it was a UNION that I needed. The trick was to create an empty column in users_data_new to match users_data_existing. I also had a challenge with sorting it so I will include that here as well.
(SELECT data_existing.date AS submitdate,status_user.status,action.action,
data.data,data_existing.message
FROM users_data_existing AS data_existing
INNER JOIN users_requested_status_user status_user
ON data_existing.status_user_id = status_user.status_user_id
INNER JOIN users_requested_action action
ON data_existing.action_id = action.action_id
INNER JOIN websites data
ON data_existing.data_id = data.id
ORDER BY data_existing.date DESC) -- sorts this sub-query only (MySQL ignores it here without LIMIT)
UNION ALL
(SELECT data_new.date AS submitdate,status_user.status,
action.action,data_new.data_new,'' AS message -- needed to add this last empty column
FROM users_data_new AS data_new
INNER JOIN users_requested_status_user status_user
ON data_new.status_user_id = status_user.status_user_id
INNER JOIN users_requested_action action
ON data_new.action_id = action.action_id
ORDER BY data_new.date DESC) -- sorts this sub-query only (MySQL ignores it here without LIMIT)
ORDER BY submitdate DESC; -- sorts the entire result
Keep in mind that, because the date column is aliased, the associative array key will be whatever alias name you use, e.g. $result['submitdate'].
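The padding trick can be tried in isolation. The following is only a minimal sketch: it uses Python's sqlite3 module as a stand-in for MySQL, and the simplified table layouts and sample rows are assumptions made for illustration, not the real schema from the question.

```python
import sqlite3

# SQLite stands in for MySQL; the schema and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_data_existing (date TEXT, data TEXT, message TEXT);
    CREATE TABLE users_data_new (date TEXT, data_new TEXT);
    INSERT INTO users_data_existing VALUES
        ('2011-01-01', 'data', 'hello'),
        ('2011-01-04', 'data1', 'world');
    INSERT INTO users_data_new VALUES
        ('2011-01-02', 'data_new'),
        ('2011-01-03', 'data_new1');
""")

query = """
    SELECT date AS submitdate, data, message
    FROM users_data_existing
    UNION ALL
    -- pad the missing column so both SELECTs return the same number of columns
    SELECT date AS submitdate, data_new, '' AS message
    FROM users_data_new
    -- one ORDER BY at the very end sorts the combined result
    ORDER BY submitdate DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each source row ends up on its own output row, and the rows from users_data_new simply carry an empty message.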

Related

SUM() over a many-to-many relation in MYSQL

I have tables stanje and transakcija in a many-to-many relation, as shown in the image:
I need a MySQL query that returns all rows in stanje, each joined with the SUM() of every transakcija.iznos connected to that stanje.
So far I have tried
select SUM(t.iznos)
from transakcija t
where transakcija_id in
(select transakcija_id from stanje_transakcija where stanje_id = ?)
which returns the SUM() correctly when given a stanje_id, but have no idea how to proceed, since I need sums for all rows in stanje.
Edit: added example output
------------------------------------
| stanje_id | naziv | SUM(t.iznos) |
------------------------------------
| 1 | a | 125.2 |
| 2 | b | -42.2 |
------------------------------------
If I understand correctly, you need to JOIN those tables on the transakcija_id and stanje_id columns.
Given your expected result, you can try SUM with GROUP BY:
select t2.stanje_id,t2.naziv,SUM(t.iznos)
from transakcija t
INNER JOIN stanje_transakcija t1 on t.transakcija_id = t1.transakcija_id
INNER JOIN stanje t2 on t2.stanje_id = t1.stanje_id
GROUP BY t2.stanje_id,t2.naziv
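Since the question asks for all rows in stanje, a LEFT JOIN variant may be worth considering so that a stanje without any transakcija still appears. The sketch below runs on Python's sqlite3 as a stand-in for MySQL; the sample rows, the integer amounts and the extra stanje 'c' are assumptions for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stanje (stanje_id INTEGER PRIMARY KEY, naziv TEXT);
    CREATE TABLE transakcija (transakcija_id INTEGER PRIMARY KEY, iznos INTEGER);
    CREATE TABLE stanje_transakcija (stanje_id INTEGER, transakcija_id INTEGER);
    INSERT INTO stanje VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO transakcija VALUES (1, 100), (2, 25), (3, -42);
    INSERT INTO stanje_transakcija VALUES (1, 1), (1, 2), (2, 3);
""")

# Start from stanje and LEFT JOIN outward, so stanje 3 (no transactions)
# is kept; COALESCE turns its NULL sum into 0.
rows = conn.execute("""
    SELECT s.stanje_id, s.naziv, COALESCE(SUM(t.iznos), 0) AS total
    FROM stanje s
    LEFT JOIN stanje_transakcija st ON st.stanje_id = s.stanje_id
    LEFT JOIN transakcija t ON t.transakcija_id = st.transakcija_id
    GROUP BY s.stanje_id, s.naziv
    ORDER BY s.stanje_id
""").fetchall()
print(rows)
```

With INNER JOINs, as in the query above, stanje 'c' would be missing from the result entirely.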

Issue with mysql query that calls column name from another table

I have two tables, one of which is an index (or map) that helps with querying the other.
SELECT v.*
FROM smv_ v
WHERE (SELECT p.network
FROM providers p
WHERE p.provider_id = v.provider_id) = 'RUU='
AND (SELECT p.va
FROM providers p
WHERE p.provider_id = v.provider_id) = 'MjU='
LIMIT 1;
Because we do not know the name of the column that holds the main data, we need to look it up, using the provider_id which is in both tables, and then query.
I am not getting any errors, but also no data back. I have spent the past hour trying to put this on sqlfiddle, but it kept crashing, so I just wanted to check if my code is really wrong, hence the crashing?
In the above example, I am looking in the providers table for column network, where the provider_id matches, and then use that as the column on smv.
I am sure I have done this before just like this, but after trying all weekend I thought I would ask here.
Thanks in Advance.
UPDATE
Here is an example of the data:
This is the providers table. It provides the link, so no matter what the name of the column on the smv table is, we can link them.
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| | A | B | C | D | E | F | G | H |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| 1 | 1 | Home | network | batch | bs | bp | va | bex |
| 2 | 2 | Recharge | code | id | serial | pin | value | expire |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
In the example above, column G tells us which smv column to use: for Recharge it is value. So that is the column we would look at in our WHERE clause.
Here is the smv table:
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| | A | B | C | D | E | F | value | va |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| 1 | 1 | X2 | Home | 4 | 10 | 2016-09-26 15:20:58 | | 7 |
| 2 | 2 | X2 | Recharge | 4 | 11 | 2016-09-26 15:20:58 | 9 | |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
value in the same example as above would be 9, or 'RUU=' decoded.
So we do not know the name of the column until the row from smv is read; once we have the row, we can look up which column name we need to get the correct information.
Hope this helps.
MORE INFO
At the point of triggering, we do not know which column of the row holds the right data, because many of the fields will be empty. The map is there to help us query the right column to get the right row (smv grows over time depending on what's uploaded).
1) SELECT p.va FROM providers p WHERE p.network = 'Recharge' ;
2) SELECT s.* FROM smv s, providers p WHERE p.network = 'Recharge';
1) gives me the correct column I need to look up and query smv, using the above examples it would come back with "value". So I need to now look up, within the smv table, C = Recharge, and value = '9'. This should bring me back row 2 of the smv table.
So individually both 1 and 2 queries work, but I need them put together so the query is done on the database server.
Hope this gives more insight
Even More Info
From reading other posts, which do not really do what I need, I have come up with this:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND net = '20'
ORDER BY from_base64(val.bex) DESC;
The above comes back blank, but if I replace net in the WHERE clause with a column I know exists, I do get the expected results:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND value = '20'
ORDER BY from_base64(val.bex) DESC;
So what am I doing wrong? Why is net not treated as the column named by the subquery ("value")?
Thanks
SELECT
v.*,
p.network, p.va
FROM
smv_ v
INNER JOIN
providers p ON p.provider_id = v.provider_id
WHERE
p.network = 'RUU=' AND p.va = 'MjU='
LIMIT 1;
The tables talk to each other via the JOIN syntax. This completely circumvents the need (and limitations) of sub-selects.
The INNER JOIN means that only fully successful matches are returned. You may need to adjust the join type for your situation, but this SQL will return a row of all v columns where p.va = 'MjU=', p.network = 'RUU=' and p.provider_id = v.provider_id.
What I was trying to explain in comments is that subqueries do not have any knowledge of their outer query:
SELECT *
FROM a
WHERE (SELECT * FROM b WHERE a)
AND (SELECT * FROM c WHERE a OR b)
This layout (as you have in your question) is that b knows nothing about a because the b query is executed first, then the c query, then finally the a query. So your original query is looking for WHERE p.provider_id = v.provider_id but v has not yet been defined so the result is false.
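For the lookup described in the question, where providers stores the name of the smv column to test, it may help to note that SQL compares whatever a subquery or joined column returns as data and never treats it as a column name. One possible workaround is a CASE expression that maps each possible name to its column. This is only a sketch under assumptions: Python's sqlite3 stands in for MySQL, the tables are cut down to hypothetical layouts based on the sample data, and the values are left unencoded.

```python
import sqlite3

# SQLite stands in for MySQL; both tables are simplified, hypothetical layouts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE providers (provider_name TEXT, va TEXT);
    CREATE TABLE smv (id INTEGER, provider_name TEXT, value TEXT, va TEXT);
    INSERT INTO providers VALUES ('Home', 'va'), ('Recharge', 'value');
    INSERT INTO smv VALUES (1, 'Home', NULL, '7'), (2, 'Recharge', '9', NULL);
""")

# Naive attempt: p.va contains the text 'value', which is compared to '9'
# as plain data, so nothing matches.
naive = conn.execute("""
    SELECT s.* FROM smv s
    JOIN providers p ON p.provider_name = s.provider_name
    WHERE p.provider_name = 'Recharge' AND p.va = '9'
""").fetchall()

# CASE picks the real smv column according to the name stored in p.va.
query = """
    SELECT s.*
    FROM smv s
    JOIN providers p ON p.provider_name = s.provider_name
    WHERE p.provider_name = ?
      AND CASE p.va
              WHEN 'value' THEN s.value
              WHEN 'va'    THEN s.va
          END = ?
"""
recharge = conn.execute(query, ("Recharge", "9")).fetchall()
home = conn.execute(query, ("Home", "7")).fetchall()
print(naive, recharge, home)
```

The CASE list has to name every column the map can point to; if the set of columns is open-ended, building the statement dynamically (for example with a prepared statement) would be the other route.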

Join tables keeping empty results

I have two tables that I need to cross and return as many results as the ids of one of them.
The first is a table of roles/tasks:
id | rolename
---+---------
1 | check_in
2 | cleaning
3 | taxi
4 | guide
5 | car_rental
6 | meals
7 | house_owner
20 | custom
and another table that has the columns:
id | client_booking_id | staff_role_id | confirmed | staff_cost
I need a query that always gives me as many results as there are rows in the first table, because for each unique client_booking_id there will be at most one of each of those tasks/roles.
So if I do:
SELECT sr.role_name, sr.id, ss.staff_cost, ss.confirmed
FROM staff_role AS sr
LEFT JOIN staff_schedule AS ss ON sr.id=ss.staff_role_id
I get a result with the number of rows I want. Now I need to match it with a specific client_booking_id, so I did:
SELECT sr.role_name, sr.id, ss.staff_cost, ss.confirmed
FROM staff_role AS sr
LEFT JOIN staff_schedule AS ss ON sr.id=ss.staff_role_id
WHERE ss.client_booking_id=1551 -- <-- this is the new line
And this gives me only 2 results, because in the second table I have only booked 2 tasks for that id.
But I need a result with all tasks even those that do not match, with NULL values. How can I do this?
Without the WHERE clause, your query returns rows with both NULL and non-NULL values for client_booking_id. You want to match a specific client_booking_id while still keeping the roles that have no match, so add the client_booking_id condition to the LEFT JOIN instead of the WHERE clause.
Moving the condition into the LEFT JOIN:
select sr.role_name
, sr.id
, ss.staff_cost
, ss.confirmed
from staff_role sr
left join staff_schedule ss on sr.id = ss.staff_role_id
and ss.client_booking_id = 1551
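The difference between the two placements can be seen side by side. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with a shortened role list and made-up schedule rows as assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff_role (id INTEGER PRIMARY KEY, role_name TEXT);
    CREATE TABLE staff_schedule (
        id INTEGER PRIMARY KEY, client_booking_id INTEGER,
        staff_role_id INTEGER, confirmed INTEGER, staff_cost INTEGER);
    INSERT INTO staff_role VALUES (1, 'check_in'), (2, 'cleaning'), (3, 'taxi');
    INSERT INTO staff_schedule VALUES
        (1, 1551, 1, 1, 30), (2, 1551, 3, 0, 45), (3, 1600, 2, 1, 20);
""")

base = (
    "SELECT sr.role_name, ss.staff_cost "
    "FROM staff_role sr "
    "LEFT JOIN staff_schedule ss ON sr.id = ss.staff_role_id"
)

# Condition in WHERE: applied after the join, so the NULL-extended rows
# are filtered out and the LEFT JOIN behaves like an INNER JOIN.
in_where = conn.execute(
    base + " WHERE ss.client_booking_id = 1551 ORDER BY sr.id").fetchall()

# Condition in ON: applied while matching, so every role is kept and
# unmatched roles get NULL for the schedule columns.
in_on = conn.execute(
    base + " AND ss.client_booking_id = 1551 ORDER BY sr.id").fetchall()

print(in_where)
print(in_on)
```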

Select values based on other value within joined tables SQL

I would like to ask for help with an SQL query that gives me values from two tables.
As an example, I have one table orders and one table processing.
I would like to make a report of the orders and the state of processing.
table orders
id | status | div
-------------------
1 | waiting_r | div1
2 | closed | div2
3 | closed | div3
table processing:
id | order_id | type | date
----------------------------------------
1 | 2 | send_request | 15.01.15
2 | 2 | send_invoice | 30.01.15
3 | 1 | send_request | 01.02.15
4 | 3 | send_request2 | 10.02.15
5 | 3 | send_invoice | 15.02.15
what I would like to get:
order_id | status | date_request | date_request2 | date_invoice
--------------------------------------------------------------------------------
1 | waiting_r | 01.02.15 | NULL | NULL
2 | closed | 15.01.15 | NULL | 30.01.15
3 | closed | NULL | 10.02.15 | 15.02.15
my solution:
select orders.id as order_id, orders.status,
IF(processing.type='send_invoice', date_format(processing.date, '%Y-%m-%d'), NULL) as date_invoice,
IF(processing.type='send_request', date_format(processing.date, '%Y-%m-%d'), NULL) as date_request,
IF(processing.type='send_request2', date_format(processing.date, '%Y-%m-%d'), NULL) as date_request2
from orders
inner join processing on orders.id = processing.order_id
where
case
when orders.status='closed' then processing.type='send_invoice'
when orders.status='waiting_r' then processing.type='send_request'
when orders.status='waiting_2'then processing.type='send_request2'
end
This works fine, but with these IF expressions I don't get the dates of the requests once an invoice has been sent; I only get the date of the invoice.
Instead of the CASE expression I tried the following, but then I get more than one line for every order. When I tried to "group by", I got mixed data.
where
processing.type in ('send_invoice', 'send_request', 'completion_request_send')
You need to left-join the second table to the first three times, like so.
SELECT o.id AS order_id, o.status,
p1.date AS date_request,
p2.date AS date_request2,
p3.date AS date_invoice
FROM orders o
LEFT JOIN processing p1 ON o.id = p1.order_id AND p1.type='send_request'
LEFT JOIN processing p2 ON o.id = p2.order_id AND p2.type='send_request2'
LEFT JOIN processing p3 ON o.id = p3.order_id AND p3.type='send_invoice'
ORDER BY 1,2
This left-join with an id-matching criterion and the specific type choice pulls out the rows you need for each column. Left, as opposed to inner, join, allows the missing values to be shown as null.
Here it is, working. http://sqlfiddle.com/#!9/b8c74/5/0
This is a typical pattern for joining a key/value table where the (id/key) pairs are unique.
Edit: Unfortunately, it generates duplicate result set rows when a particular order has more than one row of the same type. To deal with that, it's necessary to deduplicate the key/value table (processing in this case).
This subquery will do that, taking the latest date value.
SELECT type, order_id, MAX(date) AS date
FROM processing
GROUP BY type, order_id
Then you have to use that subquery in the main query. This is where common table expressions would help, but MySQL versions before 8.0 don't have them, so things get kind of verbose.
SELECT o.id AS order_id, o.status,
p1.date AS date_request,
p2.date AS date_request2,
p3.date AS date_invoice
FROM orders o
LEFT JOIN (
SELECT type, order_id, MAX(date) AS date
FROM processing
GROUP BY type, order_id
) p1 ON o.id = p1.order_id AND p1.type='send_request'
LEFT JOIN (
SELECT type, order_id, MAX(date) AS date
FROM processing
GROUP BY type, order_id
) p2 ON o.id = p2.order_id AND p2.type='send_request2'
LEFT JOIN (
SELECT type, order_id, MAX(date) AS date
FROM processing
GROUP BY type, order_id
) p3 ON o.id = p3.order_id AND p3.type='send_invoice'
ORDER BY 1,2
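A different technique that avoids repeating the subquery is conditional aggregation: join processing once and let MAX(CASE ...) pick the date for each column. This is an alternative to the approach above, not the same method. The sketch runs on Python's sqlite3 as a stand-in for MySQL, with ISO-formatted dates and an extra duplicate send_invoice row added as assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; row 6 is a hypothetical duplicate
# (second send_invoice for order 3) to show the deduplication.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE processing (
        id INTEGER PRIMARY KEY, order_id INTEGER, type TEXT, date TEXT);
    INSERT INTO orders VALUES (1, 'waiting_r'), (2, 'closed'), (3, 'closed');
    INSERT INTO processing VALUES
        (1, 2, 'send_request',  '2015-01-15'),
        (2, 2, 'send_invoice',  '2015-01-30'),
        (3, 1, 'send_request',  '2015-02-01'),
        (4, 3, 'send_request2', '2015-02-10'),
        (5, 3, 'send_invoice',  '2015-02-15'),
        (6, 3, 'send_invoice',  '2015-02-20');
""")

# Each CASE yields the date only for its own type (NULL otherwise);
# MAX ignores NULLs and keeps the latest date per order and type.
rows = conn.execute("""
    SELECT o.id AS order_id, o.status,
           MAX(CASE WHEN p.type = 'send_request'  THEN p.date END) AS date_request,
           MAX(CASE WHEN p.type = 'send_request2' THEN p.date END) AS date_request2,
           MAX(CASE WHEN p.type = 'send_invoice'  THEN p.date END) AS date_invoice
    FROM orders o
    LEFT JOIN processing p ON p.order_id = o.id
    GROUP BY o.id, o.status
    ORDER BY o.id
""").fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps orders that have no processing rows at all; they would come back with NULL in all three date columns.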

Faster sql query than join

I have a big table with more than 10,000 rows and it will grow to 1,000,000 in the near future, and I need to run a query which gives back a Time value for each keyword for each user. I have one right now which is quite slow because I use left joins and it needs one subquery / keyword:
SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_time
FROM
rawdata left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Facebook_Time'
FROM rawdata
WHERE MainWindowTitle LIKE '%Facebook%'
GROUP by user)t1 on rawdata.user = t1.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Outlook_Time'
FROM rawdata
WHERE MainWindowTitle LIKE '%Outlook%'
GROUP by user)t2 on rawdata.user = t2.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Excel_Time'
FROM rawdata
WHERE MainWindowTitle LIKE '%Excel%'
GROUP by user)t3 on rawdata.user = t3.user
The table looks like this:
WindowTitle | StartTime | EndTime | User
------------|-----------|---------|---------
Form1 | DateTime | DateTime| user1
Form2 | DateTime | DateTime| user2
... | ... | ... | ...
Form_n | DateTime | DateTime| user_n
The output should look like this:
User | Keyword | SUM(EndTime-StartTime)
-------|-----------|-----------------------
User1 | 'Facebook'| 00:34:12
User1 | 'Outlook' | 00:12:34
User1 | 'Excel' | 00:43:13
User2 | 'Facebook'| 00:34:12
User2 | 'Outlook' | 00:12:34
User2 | 'Excel' | 00:43:13
... | ... | ...
User_n | ... | ...
And the question is, which is the fastest way in MySQL to do this?
I think your wildcard searches are probably what's slowing it down the most, since a LIKE pattern with a leading % can't use an index on that field. Avoiding the sub-queries and doing a straight join might also help, but the wildcard searches are far worse. Is there any way you could change the table to have a categoryName or categoryID that can be indexed and doesn't require a wildcard search? Like "where categoryName = 'Outlook'"
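The effect on index usage can be checked from the query plan. The sketch below is an approximation only: it uses Python's sqlite3 and SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the categoryID column and index names are hypothetical. The exact plan text differs between SQLite versions, and MySQL's optimizer output looks different again, though the leading-wildcard limitation applies there as well.

```python
import sqlite3

# SQLite stands in for MySQL; column and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rawdata (MainWindowTitle TEXT, categoryID INTEGER);
    CREATE INDEX idx_title ON rawdata (MainWindowTitle);
    CREATE INDEX idx_category ON rawdata (categoryID);
""")

def plan(sql):
    """Return the 'detail' column of SQLite's query plan as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Leading wildcard: the index on MainWindowTitle cannot be used.
wildcard_plan = plan(
    "SELECT * FROM rawdata WHERE MainWindowTitle LIKE '%Outlook%'")

# Equality on an indexed column: the index can be used.
category_plan = plan("SELECT * FROM rawdata WHERE categoryID = 1")

print(wildcard_plan)
print(category_plan)
```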
To optimize the data in your tables, add a categoryID (ideally this would reference a separate table, but let's just use arbitrary numbers for this example):
alter table rawData add column categoryID int not null;
alter table rawData add index (categoryID);
Then populate the categoryID field for the existing data:
update rawData set categoryID=1 where MainWindowTitle like '%Outlook%';
update rawData set categoryID=2 where MainWindowTitle like '%Facebook%';
-- etc...
Then change your insert to follow the same rules.
Then make your SELECT query like this (changed wild cards to categoryID):
-- TIMESTAMPDIFF is used here because subtracting DATETIME values directly
-- treats them as plain numbers and gives wrong durations across hour boundaries
SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_Time
FROM
rawdata left join
(SELECT user, sec_to_time(SUM(TIMESTAMPDIFF(SECOND, StartTime, EndTime))) as 'Facebook_Time'
FROM rawdata
WHERE categoryID = 2
GROUP by user)t1 on rawdata.user = t1.user left join
(SELECT user, sec_to_time(SUM(TIMESTAMPDIFF(SECOND, StartTime, EndTime))) as 'Outlook_Time'
FROM rawdata
WHERE categoryID = 1
GROUP by user)t2 on rawdata.user = t2.user left join
(SELECT user, sec_to_time(SUM(TIMESTAMPDIFF(SECOND, StartTime, EndTime))) as 'Excel_Time'
FROM rawdata
WHERE categoryID = 3
GROUP by user)t3 on rawdata.user = t3.user
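Once a categoryID exists, the one-row-per-user-and-keyword output asked for in the question can probably be produced in a single pass with GROUP BY user, categoryID, without any sub-queries or joins. The sketch below is an assumption-laden illustration: Python's sqlite3 stands in for MySQL, the rows are made up, and durations are computed with SQLite's strftime('%s', ...); in MySQL the equivalent expression would be TIMESTAMPDIFF(SECOND, StartTime, EndTime), optionally wrapped in SEC_TO_TIME.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rawdata (
        user TEXT, categoryID INTEGER, StartTime TEXT, EndTime TEXT);
    INSERT INTO rawdata VALUES
        ('user1', 1, '2016-09-26 15:00:00', '2016-09-26 15:10:00'),
        ('user1', 1, '2016-09-26 15:59:00', '2016-09-26 16:01:00'),
        ('user1', 2, '2016-09-26 16:05:00', '2016-09-26 16:35:00'),
        ('user2', 2, '2016-09-26 09:00:00', '2016-09-26 09:00:30');
""")

# One scan of the table: each row's duration in seconds is summed
# per (user, categoryID) group.
rows = conn.execute("""
    SELECT user, categoryID,
           SUM(strftime('%s', EndTime) - strftime('%s', StartTime)) AS seconds
    FROM rawdata
    GROUP BY user, categoryID
    ORDER BY user, categoryID
""").fetchall()
for row in rows:
    print(row)
```

A composite index on (user, categoryID) might help the grouping further on a large table, though that would need to be measured on the real data.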