sql join data from two tables - mysql

I wonder if someone could help me join data from two tables. I have spent all day on it without success.
Code 1 selects:
Year | Turnover1 | Quantity1 | EurPerOrder1
SELECT Year(table1.ContractDate) AS Year,
Sum(table1.TPrice) AS Turnover1,
Count(table1.id) AS Quantity1,
ROUND(Sum(table1.TPrice) / Count(table1.id), 0) AS EurPerOrder1
FROM table1
GROUP BY Year(table1.ContractDate) * 100
ORDER BY table1.ContractDate DESC
Code 2 selects:
Year | Turnover2 | Quantiry2 | EurPerOrder2
SELECT Year(table2.date) AS Year,
Sum(table2.price) AS Turnover2,
Count(table2.rid) AS Quantiry2,
ROUND(Sum(table2.price) / Count(table2.rid), 0) AS EurPerOrder2
FROM table2
GROUP BY Year(table2.date) * 100
ORDER BY table2.date DESC
And I need to join data like:
Year | Turnover1 | Quantity1 | EurPerOrder1 | Turnover2 | Quantiry2 | EurPerOrder2
I need all the data from both tables grouped by year. Even if table2 has no rows for 2013, I would still like that year shown, with 0 or empty values.
I have tried different approaches based on examples, but nothing worked. I think the problem may be that the second table doesn't have all the years that are in table1.

First, you can read a pretty good explanation of JOINs here
OK, according to the question you need a LEFT JOIN. This means all data from table1 and only the matching data from table2.
The SELECT must look like:
SELECT t1.Year,
t1.Turnover1,
t1.Quantity1,
t1.EurPerOrder1,
t2.Turnover2,
t2.Quantiry2,
t2.EurPerOrder2
FROM
-- aggregate each table per year first, so the join cannot multiply rows
(SELECT Year(ContractDate) AS Year,
Sum(TPrice) AS Turnover1,
Count(id) AS Quantity1,
ROUND(Sum(TPrice) / Count(id), 0) AS EurPerOrder1
FROM table1
GROUP BY Year(ContractDate)) t1
LEFT JOIN
(SELECT Year(date) AS Year,
Sum(price) AS Turnover2,
Count(rid) AS Quantiry2,
ROUND(Sum(price) / Count(rid), 0) AS EurPerOrder2
FROM table2
GROUP BY Year(date)) t2 ON t1.Year = t2.Year
ORDER BY
t1.Year DESC
Of course, you still need to handle NULL values. See link
Please check the SQL and correct it if there are errors. I don't have live data to check it against (by running it).
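One way the NULL handling could look, as a rough sketch only: each table is aggregated per year before the join, and COALESCE() turns the missing table2 years into 0. Table and column names are taken from the question; the choice of 0 as the replacement value is an assumption about what the asker wants to see.

```sql
-- Sketch: names come from the question; 0 as the fallback value is an assumption.
SELECT t1.Year,
       t1.Turnover1,
       t1.Quantity1,
       COALESCE(t2.Turnover2, 0) AS Turnover2,  -- 0 when table2 has no rows for the year
       COALESCE(t2.Quantiry2, 0) AS Quantiry2
FROM (SELECT YEAR(ContractDate) AS Year,
             SUM(TPrice)        AS Turnover1,
             COUNT(id)          AS Quantity1
      FROM table1
      GROUP BY YEAR(ContractDate)) t1
LEFT JOIN (SELECT YEAR(date)  AS Year,
                  SUM(price)  AS Turnover2,
                  COUNT(rid)  AS Quantiry2
           FROM table2
           GROUP BY YEAR(date)) t2
       ON t1.Year = t2.Year
ORDER BY t1.Year DESC;
```

Years that exist only in table2 would still be missing from this result, because the LEFT JOIN starts from table1.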

Related

Remove continuous duplicated values with different IDs in MySQL

I know there are a ton of similar questions about finding and removing duplicate values in MySQL, but my question is a bit different:
I have a table with the columns ID, Timestamp and price. A script scrapes data from another webpage and saves it in the database every 10 seconds. Sometimes the data ends up like this:
| id | timestamp | price |
|----|-----------|-------|
| 1 | 12:13 | 100 |
| 2 | 12:14 | 120 |
| 3 | 12:15 | 100 |
| 4 | 12:16 | 100 |
| 5 | 12:17 | 110 |
As you can see, there are 3 duplicated values, and removing the row with ID = 4 will shrink the table without damaging data integrity. I need to remove consecutive duplicated records except the first one (the one with the lowest ID or Timestamp).
Is there an efficient way to do it? (There are about a million records.)
I edited my scraping script so it checks for a duplicated price before adding it, but I need to shrink and maintain my old data.
Since MySQL 8.0 you can use the window function LAG() as follows:
delete tbl.* from tbl
join (
-- use lag(price) to get the value from the previous row
select id, lag(price) over (order by id) price from tbl
) l
-- join each row to its previous price; rows where both match will be deleted
on tbl.id = l.id and tbl.price = l.price;
fiddle
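Before running a DELETE on a million rows, it may be worth previewing which rows would go. The sketch below applies the same LAG() idea as a plain SELECT. The table name price_demo and its rows are hypothetical, made up to mirror the sample in the question, and MySQL 8.0 or later is assumed.

```sql
-- Hypothetical demo table mirroring the sample data in the question.
CREATE TABLE price_demo (
    id          INT PRIMARY KEY,
    `timestamp` TIME,
    price       INT
);
INSERT INTO price_demo (id, `timestamp`, price) VALUES
    (1, '12:13:00', 100),
    (2, '12:14:00', 120),
    (3, '12:15:00', 100),
    (4, '12:16:00', 100),
    (5, '12:17:00', 110);

-- Rows whose price equals the price of the previous row (ordered by id).
SELECT id, `timestamp`, price
FROM (
    SELECT id, `timestamp`, price,
           LAG(price) OVER (ORDER BY id) AS prev_price
    FROM price_demo
) x
WHERE price = prev_price;
```

With the sample rows, only id 4 is returned: id 3 also has price 100, but the row before it has 120.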
I am just grouping by price and keeping only one record per group. The lowest id gets displayed. Hope the query below helps.
select id,timestamp,price from yourTable group by price having count(price)>0;
My query is based on @Tim Biegeleisen's one.
-- delete records
DELETE
FROM yourTable t1
-- where exists an older one with the same price
WHERE EXISTS (SELECT 1
FROM yourTable t2
WHERE t2.price = t1.price
AND t2.id < t1.id
-- but no record with a different price exists between this one and the older one
AND NOT EXISTS (SELECT 1
FROM yourTable t3
WHERE t1.price <> t3.price
AND t3.id > t2.id
AND t3.id < t1.id));
It deletes records for which an older one with the same price exists and no record with a different price lies between the two.
The same check could be done on the timestamp column if the id column is not numeric and ascending.

Mysql each row sum

How can I get a result like the one below with MySQL?
> +--------+------+------------+
> | code | qty | total |
> +--------+------+------------+
> | aaa | 30 | 75 |
> | bbb | 20 | 45 |
> | ccc | 25 | 25 |
> +--------+------+------------+
total is the qty of the row plus the qty of all the rows that come after it.
You can do this with a correlated subquery -- assuming that the ordering is alphabetical:
select code, qty,
(select sum(t2.qty)
from mytable t2
where t2.code >= t.code
) as total
from mytable t;
SQL tables represent unordered sets. So, a table, by itself, has no notion of rows coming after. In your example, the codes are alphabetical, so they provide one definition. In practice, there is usually an id or creation date that serves this purpose.
I would use a join; IMHO it usually fits better.
Data:
create table tab (
code varchar(10),
qty int
);
insert into tab (code, qty)
select * from (
select 'aaa' as code, 30 as qty union
select 'bbb', 20 union
select 'ccc', 25
) t
Query:
select t.code, t.qty, sum(t1.qty) as total
from tab t
join tab t1 on t.code <= t1.code
group by t.code, t.qty
order by t.code
The best way is to try both queries (mine and the one with the subquery that @Gordon mentioned) and choose the faster one.
Fiddle: http://sqlfiddle.com/#!2/24c0f/1
Consider using variables. It looks like this:
select code, qty, (@total := ifnull(@total, 0) + qty) as total
from your_table
order by code desc
...and reverse the list of query results afterward.
If you need a pure SQL solution, you can compute the sum of all your qty values and store it in a variable.
Also, look at: Calculate a running total in MySQL
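On MySQL 8.0 or later, a window function is another way to express this reversed running total. The following is only a sketch: the table name running_demo is hypothetical and its rows copy the sample from the question, and it assumes, like the answers above, that "rows that come after" means alphabetical order of code.

```sql
-- Hypothetical demo table with the rows from the question.
CREATE TABLE running_demo (
    code VARCHAR(10),
    qty  INT
);
INSERT INTO running_demo (code, qty) VALUES
    ('aaa', 30), ('bbb', 20), ('ccc', 25);

-- Summing in descending code order gives each row its own qty
-- plus the qty of every row that sorts after it.
SELECT code,
       qty,
       SUM(qty) OVER (ORDER BY code DESC) AS total
FROM running_demo
ORDER BY code;
```

For the sample rows this yields 75, 45 and 25 for aaa, bbb and ccc, with no need to reverse the result afterward.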

Identifying groups in Group By

I am running a complicated group by statement and I get all my results in their respective groups. But I want to create a custom column with their "group id". Essentially all the items that are grouped together would share an ID.
This is what I get:
partID | Description
-------+---------+--
11000 | "Oven"
12000 | "Oven"
13000 | "Stove"
13020 | "Stove"
12012 | "Grill"
This is what I want:
partID | Description | GroupID
-------+-------------+----------
11000 | "Oven" | 1
12000 | "Oven" | 1
13000 | "Stove" | 2
13020 | "Stove" | 2
12012 | "Grill" | 3
"GroupID" does not exist as data in any of the tables, it would be a custom generated column (alias) that would be associated to that group's key,id,index, whatever it would be called.
How would I go about doing this?
I think this is the query that returns the five rows:
select partId, Description
from part p;
Here is one way (using standard SQL) to get the groups:
select partId, Description,
(select count(distinct Description)
from part p2
where p2.Description <= p.Description
) as GroupId
from part p;
This uses a correlated subquery. The subquery finds all the description values less than or equal to the current one and counts the distinct values. Note that this gives a different set of values from the ones in the OP: they are assigned alphabetically rather than by first encounter in the data. If that is important, the OP should add it to the question. Based on the question, the particular ordering did not seem important.
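On MySQL 8.0 or later, the same alphabetical numbering can also be sketched with the window function DENSE_RANK(). The table name part_demo and its rows are made up here to mirror the sample in the question; they are not the OP's real schema.

```sql
-- Hypothetical demo table mirroring the sample rows in the question.
CREATE TABLE part_demo (
    partID      INT,
    Description VARCHAR(20)
);
INSERT INTO part_demo (partID, Description) VALUES
    (11000, 'Oven'),  (12000, 'Oven'),
    (13000, 'Stove'), (13020, 'Stove'),
    (12012, 'Grill');

-- DENSE_RANK() gives equal descriptions the same number, without gaps.
SELECT partID,
       Description,
       DENSE_RANK() OVER (ORDER BY Description) AS GroupID
FROM part_demo
ORDER BY GroupID, partID;
```

As with the correlated subquery, the ids follow alphabetical order (Grill 1, Oven 2, Stove 3) rather than the order of first appearance.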
Here's one way to get it:
SELECT p.partID,p.Description,b.groupID
FROM (
SELECT Description, @rn := @rn + 1 AS groupID
FROM (
SELECT distinct description
FROM part, (SELECT @rn := 0) c
) a
) b
INNER JOIN part p ON p.description = b.description;
sqlfiddle demo
This assigns a different groupID to each description, and then joins the original table on that description.
Based on your comments in response to Gordon's answer, I think what you need is a derived table to generate your groupids, like so:
select
t1.description,
@cntr := @cntr + 1 as GroupID
FROM
(select distinct table1.description from table1) t1
cross join
(select @cntr:=0) t2
which will give you:
DESCRIPTION GROUPID
Oven 1
Stove 2
Grill 3
Then you can use that in your original query, joining on description:
select
t1.partid,
t1.description,
t2.GroupID
from
table1 t1
inner join
(
select
t1.description,
@cntr := @cntr + 1 as GroupID
FROM
(select distinct table1.description from table1) t1
cross join
(select @cntr:=0) t2
) t2
on t1.description = t2.description
SQL Fiddle
SELECT partID, Description, @s:=@s+1 GroupID
FROM part, (SELECT @s:= 0) AS s
GROUP BY Description

What to do with Full Outer Join

I need a Full Outer Join in MySQL. I found a solution here: Full Outer Join in MySQL. My problem is that t1 and t2 are subqueries themselves, so the resulting query looks like a monster.
What to do in this situation? Should I use views instead of subqueries?
Edit:
I'll try to explain a bit more. I have orders and payments. One payment can cover multiple orders, and one order can be covered by multiple payments. That is why I have the tables orders, payments, and paymentitems. Each order has the fields company (which made the order) and manager (who accepted the order). Now I need to group orders and payments by company and manager and total the money. So I want to get something like this:
company1 | managerA | 200 | 200 | 0
company1 | managerB | Null | 100 | 100
company1 | managerC | 300 | Null | -300
company2 | managerA | 150 | Null | -150
company2 | managerB | 100 | 350 | 250
The query, I managed to create:
SELECT coalesce(o.o_company, p.o_company)
, coalesce(o.o_manager, p.o_manager)
, o.orderstotal
, p.paymentstotal
, (coalesce(p.paymentstotal, 0) - coalesce(o.orderstotal, 0)) AS balance
FROM
(((/*Subquery A*/SELECT orders.o_company
, orders.o_manager
, sum(o_money) AS orderstotal
FROM
orders
WHERE
(o_date >= @startdate)
AND (o_date <= @enddate)
GROUP BY
o_company
, o_manager) AS o
LEFT JOIN (/*Subquery B*/SELECT orders.o_company
, orders.o_manager
, sum(paymentitems.p_money) AS paymentstotal
FROM
((payments
INNER JOIN paymentitems
ON payments.p_id = paymentitems.p_id)
INNER JOIN orders
ON paymentitems.p_oid = orders.o_id)
WHERE
(payments.p_date >= @startdate)
AND (payments.p_date <= @enddate)
GROUP BY
orders.o_company
, orders.o_manager) AS p
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager))
union
(/*Subquery A*/
right join /*Subquery B*/
ON (o.o_company = p.o_company) and (o.o_manager = p.o_manager)))
This is a simplified version of my query. The real query is much more complex, which is why I want to keep it as simple as possible. Maybe I should even split it into views, or maybe there are other options I am not aware of.
I think the clue is in "group orders and payments by company". Break the outer join into a query on orders and another query on payments, then add up the type of money (orders or payments) for each company.
If you are trying to do a full outer join and the relationship is 1-1, then you can accomplish the same thing with a union and aggregation.
Here is an example, pulling one column from two different tables:
select id, max(col1) as col1, max(col2) as col2
from ((select t1.id, t1.col1, NULL as col2
from t1
) union all
(select t2.id, NULL as col1, t2.col2
from t2
)
) t
group by id
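Applied to the orders and payments from the question, the union-and-aggregate idea might look roughly like the sketch below. Table and column names are taken from the question's query; @startdate and @enddate are assumed to be user variables set beforehand, and the aliases p, pit and o are introduced here only for illustration.

```sql
-- Sketch: one branch per kind of money, then a single GROUP BY over both.
SELECT o_company,
       o_manager,
       SUM(orderstotal)   AS orderstotal,    -- stays NULL when a group has no orders
       SUM(paymentstotal) AS paymentstotal,  -- stays NULL when a group has no payments
       COALESCE(SUM(paymentstotal), 0)
         - COALESCE(SUM(orderstotal), 0) AS balance
FROM (
    SELECT o_company, o_manager, o_money AS orderstotal, NULL AS paymentstotal
    FROM orders
    WHERE o_date >= @startdate AND o_date <= @enddate
    UNION ALL
    SELECT o.o_company, o.o_manager, NULL, pit.p_money
    FROM payments p
    INNER JOIN paymentitems pit ON p.p_id = pit.p_id
    INNER JOIN orders o ON pit.p_oid = o.o_id
    WHERE p.p_date >= @startdate AND p.p_date <= @enddate
) t
GROUP BY o_company, o_manager;
```

Because each source is read once and there is no join between the two aggregates, neither subquery has to be repeated the way the LEFT JOIN / RIGHT JOIN union requires.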

MySQL: Group by date proximity?

I wrote this query, it does almost what I want:
SELECT * FROM
(
SELECT COUNT(*) as cnt,
lat,
lon,
elev,
GROUP_CONCAT(CONCAT(usaf,'-',wban))
FROM `ISH-HISTORY_HASPOS`
GROUP BY lat,lon,elev
) AS x WHERE cnt >=1;
output:
+-----+--------+----------+--------+-------------------------------------------------+
| cnt | lat | lon | elev | GROUP_CONCAT(CONCAT(usaf,'-',wban)) |
+-----+--------+----------+--------+-------------------------------------------------+
| 4 | 30.478 | -87.187 | 36 | 722220-13899,722221-13899,722223-13899,999999-13899 |
| 4 | 36.134 | -80.222 | 295.7 | 723190-93807,723191-93807,723193-93807,999999-93807 |
| 5 | 37.087 | -84.077 | 369.1 | 723290-03849,723291-03849,723293-03849,724243-03849,999999-03849 |
| 5 | 38.417 | -113.017 | 1534.1 | 745200-23176,745201-23176,999999-23176,724757-23176,724797-23176 |
| 4 | 40.217 | -76.851 | 105.8 | 999999-14751,725110-14751,725111-14751,725118-14751 |
+-----+--------+----------+--------+-------------------------------------------------+
This returns a concatenated list of stations that are located at identical coordinates. However, I am only interested in concatenating stations with adjoining date ranges. The table that I select from (ISH-HISTORY_HASPOS) has two datetime columns : 'begin' and 'end'. I need the values for these two columns to be within 3 days of each other to satisfy the GROUP_CONCAT conditions.
Edit: In order for a station to be included in the final result's GROUP_CONCAT it must satisfy the following conditions:
1. It must be co-located with another station in the list (group by lat,lon,elev).
2. Its end time must be within 3 days of another station's begin time OR its begin time must be within 3 days of another station's end time. When I say "another station", I am referring to stations that are co-located (meet the conditions for #1).
I figure that I will have to use a subquery but I can't seem to figure out how to do it. Some help would be greatly appreciated! Either a query or a stored procedure would be great but a php solution would also be acceptable.
Here is a dump of the table that I am querying: sql dump
The results should look the same as my example, but non-adjoining items (date-wise) should not be there.
One solution is to use a subquery that computes the list of stations within 3 days of each other, and to add this subquery as a WHERE clause to the main query.
The subquery consists of a Cartesian product that lists all possible station pairs, with a first condition that keeps just the first half of the resulting matrix and two conditions that specify the time constraints. I just guessed these latter conditions, as I don't really know the unit of measure of the begin and end fields.
The resulting query could be this:
SELECT * FROM (
SELECT COUNT(*) AS
cnt,
lat,
lon,
elev,
GROUP_CONCAT(CONCAT(usaf, '-', wban))
FROM `ISH-HISTORY_HASPOS`
WHERE id IN (
SELECT DISTINCT t1.id
FROM `ISH-HISTORY_HASPOS` t1
INNER JOIN `ISH-HISTORY_HASPOS` t2
ON t1.lon = t2.lon
AND t1.lat = t2.lat
AND t1.elev = t2.elev
WHERE t1.id < t2.id
-- begin/end are datetime columns, so compare them in seconds (259200 s = 3 days)
AND (abs(TIMESTAMPDIFF(SECOND, t1.begin, t2.end)) < 259200
OR abs(TIMESTAMPDIFF(SECOND, t1.end, t2.begin)) < 259200)
UNION
SELECT DISTINCT t2.id
FROM `ISH-HISTORY_HASPOS` t1
INNER JOIN `ISH-HISTORY_HASPOS` t2
ON t1.lon = t2.lon
AND t1.lat = t2.lat
AND t1.elev = t2.elev
WHERE t1.id < t2.id
AND (abs(TIMESTAMPDIFF(SECOND, t1.begin, t2.end)) < 259200
OR abs(TIMESTAMPDIFF(SECOND, t1.end, t2.begin)) < 259200)
)
GROUP BY lat, lon, elev
) AS x WHERE cnt >= 1;
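The same filter can also be sketched with EXISTS, which avoids writing the pair condition twice. This is an untested sketch with assumptions: it does not rely on an id column and instead assumes that the pair (usaf, wban) identifies a station row, and it treats "within 3 days" as at most 259200 seconds in either direction.

```sql
-- Keep a station only if a co-located, different station has an
-- adjoining date range; then group the survivors by position.
SELECT COUNT(*) AS cnt,
       s.lat,
       s.lon,
       s.elev,
       GROUP_CONCAT(CONCAT(s.usaf, '-', s.wban)) AS stations
FROM `ISH-HISTORY_HASPOS` s
WHERE EXISTS (
    SELECT 1
    FROM `ISH-HISTORY_HASPOS` o
    WHERE o.lat = s.lat
      AND o.lon = s.lon
      AND o.elev = s.elev
      -- assumption: (usaf, wban) distinguishes one station row from another
      AND NOT (o.usaf = s.usaf AND o.wban = s.wban)
      AND (ABS(TIMESTAMPDIFF(SECOND, s.`end`, o.`begin`)) <= 259200
        OR ABS(TIMESTAMPDIFF(SECOND, s.`begin`, o.`end`)) <= 259200)
)
GROUP BY s.lat, s.lon, s.elev;
```

Like the IN version, this only checks pairwise adjacency, so one position can still be grouped into a single list even when it contains two separate chains of adjoining stations.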
I only have access to and knowledge of SQL Server, so I can't get your data to work, and I don't know if MySQL has the equivalent functionality, but here is a verbal description of what you need to do.
You need a recursive statement (WITH CTE in SQL Server) to join the table to itself on lat, lon, elev and begin BETWEEN end - 3 AND end + 3. You will need to be careful not to get caught in an infinite loop. I suggest building a comma-separated list of the IDs you have visited and checking this as you go. It's painful, but keep this list in ID order because it is what you will need to group on at the end. You also need to keep track of your depth and the original id.
Something like ...
WITH cte(id, idlist, lat, lon, elev, starts, ends, depth)
AS (
SELECT id, CAST(id AS varchar), lat, lon, elev, starts, ends, 0
FROM `ISH-HISTORY_HASPOS`
UNION ALL
-- FunctionToManagetheList and FunctionToCheckIfTheIDisintheList are placeholders
SELECT i.id, FunctionToManagetheList(cte.idlist, i.id), i.lat, i.lon, i.elev, i.starts, i.ends, cte.depth + 1
FROM `ISH-HISTORY_HASPOS` i
INNER JOIN
cte ON i.lat=cte.lat AND
i.lon=cte.lon AND
i.elev=cte.elev AND
NOT FunctionToCheckIfTheIDisintheList(i.id, cte.idlist)
)
SELECT stuffyouneed
FROM `ISH-HISTORY_HASPOS` i
INNER JOIN
(SELECT id, MAX(depth) AS MaxDepth
FROM cte
GROUP BY id) cte1 ON i.id=cte1.id
INNER JOIN
cte cte2 ON cte1.id=cte2.id AND cte1.MaxDepth=cte2.depth
GROUP BY cte2.idlist