I don't have a strong SQL background, and I recently ran into a problem that seems hard to solve with SQL alone.
I have a table
```
IMEI | DATE | A_1 | A_2 | A_3 | B_1 | B_2 | B_3
2132 | 09/21| 2 | 4 | 4 | 5 | 2 | 4
4535 | 09/22| 2 | 2 | 4 | 5 | 2 | 3
9023 | 09/21| 2 | 1 | 5 | 7 | 2 | 2
```
How can I group the values of A_1, A_2, etc. so that I get the table below? Basically, I would like to group certain columns in my table and put them into separate rows.
IMEI | DATE | MODULE | val_1 | val_2 | val_3
2132 | 09/21| A | 2 | 4 | 4
2132 | 09/21| B | 5 | 2 | 4
...
The goal is for the values under each namespace (A, B, etc.) in a row to be split out into separate rows in the new table.
Also, any suggestions on how I can improve my SQL? Are there any books I should keep as a reference, or other resources I should use?
Thanks!
You can do it with UNION:
SELECT IMEI, DATE, 'A' AS MODULE, A_1 AS val_1, A_2 AS val_2, A_3 AS val_3
FROM myTable
UNION ALL
SELECT IMEI, DATE, 'B', B_1, B_2, B_3
FROM myTable
See it on sqlfiddle.
But really, you should store your data in the form created by the above query, and then use a JOIN to create the original format as/when desired.
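To make the round trip concrete, here is a minimal sketch that runs the UNION ALL query and then rebuilds the wide format with a self-join. It uses Python's built-in sqlite3 module purely so it can be run anywhere; the table name `modules` and the SQLite column types are my own assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (IMEI INTEGER, DATE TEXT,
                      A_1 INTEGER, A_2 INTEGER, A_3 INTEGER,
                      B_1 INTEGER, B_2 INTEGER, B_3 INTEGER);
INSERT INTO myTable VALUES
  (2132, '09/21', 2, 4, 4, 5, 2, 4),
  (4535, '09/22', 2, 2, 4, 5, 2, 3),
  (9023, '09/21', 2, 1, 5, 7, 2, 2);

-- Long form: one row per (IMEI, DATE, MODULE), built with UNION ALL.
-- "modules" is a hypothetical name for the normalized table.
CREATE TABLE modules AS
SELECT IMEI, DATE, 'A' AS MODULE, A_1 AS val_1, A_2 AS val_2, A_3 AS val_3
FROM myTable
UNION ALL
SELECT IMEI, DATE, 'B', B_1, B_2, B_3
FROM myTable;
""")

long_rows = conn.execute(
    "SELECT * FROM modules ORDER BY IMEI, MODULE"
).fetchall()

# Wide form rebuilt by joining the long table to itself, once per module.
wide_rows = conn.execute("""
SELECT a.IMEI, a.DATE, a.val_1, a.val_2, a.val_3, b.val_1, b.val_2, b.val_3
FROM modules AS a
JOIN modules AS b
  ON b.IMEI = a.IMEI AND b.DATE = a.DATE AND b.MODULE = 'B'
WHERE a.MODULE = 'A'
ORDER BY a.IMEI
""").fetchall()

original = conn.execute("SELECT * FROM myTable ORDER BY IMEI").fetchall()
```

Here `wide_rows` comes back identical to `original`, which is the point: nothing is lost by storing the long form.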
I love playing with data and questions like this!
The approach below can be considered over-engineering, but I think it is still an option when you don't know your column names in advance and they follow the pattern you described. It can also be useful just for learning, since it looks like you want to improve your SQL (based on the tag for this question, I assume you mean BigQuery SQL).
#standardSQL
WITH parsed AS (
SELECT IMEI, DATE,
REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(0)], r'^"|"$', '') key,
REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(1)], r'^"|"$', '') value
FROM `yourTable` t,
UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'[{}]', ''))) row
),
grouped AS (
SELECT
IMEI, DATE,
REGEXP_EXTRACT(key, r'(.*)_') MODULE,
ARRAY_AGG(value ORDER BY CAST(REGEXP_EXTRACT(key, r'_(.*)') AS INT64)) AS vals
FROM parsed
WHERE key NOT IN ('IMEI', 'DATE')
GROUP BY IMEI, DATE, MODULE
)
SELECT IMEI, DATE, MODULE,
vals[SAFE_OFFSET(0)] AS val_1,
vals[SAFE_OFFSET(1)] AS val_2,
vals[SAFE_OFFSET(2)] AS val_3,
vals[SAFE_OFFSET(3)] AS val_4
FROM grouped
-- ORDER BY IMEI, DATE, MODULE
You can test / play with dummy data from your question
#standardSQL
WITH `yourTable` AS (
SELECT 2132 IMEI, '09/21' DATE, 2 A_1, 4 A_2, 4 A_3, 5 B_1, 2 B_2, 4 B_3 UNION ALL
SELECT 4535, '09/22', 2, 2 ,4, 5, 2, 3 UNION ALL
SELECT 9023, '09/21', 2, 1 ,5, 7, 2, 2
),
parsed AS (
SELECT IMEI, DATE,
REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(0)], r'^"|"$', '') key,
REGEXP_REPLACE(SPLIT(row, ':')[OFFSET(1)], r'^"|"$', '') value
FROM `yourTable` t,
UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'[{}]', ''))) row
),
grouped AS (
SELECT
IMEI, DATE,
REGEXP_EXTRACT(key, r'(.*)_') MODULE,
ARRAY_AGG(value ORDER BY CAST(REGEXP_EXTRACT(key, r'_(.*)') AS INT64)) AS vals
FROM parsed
WHERE key NOT IN ('IMEI', 'DATE')
GROUP BY IMEI, DATE, MODULE
)
SELECT IMEI, DATE, MODULE,
vals[SAFE_OFFSET(0)] AS val_1,
vals[SAFE_OFFSET(1)] AS val_2,
vals[SAFE_OFFSET(2)] AS val_3,
vals[SAFE_OFFSET(3)] AS val_4
FROM grouped
ORDER BY IMEI, DATE, MODULE
Output will be as below
Row IMEI DATE MODULE val_1 val_2 val_3 val_4
1 2132 09/21 A 2 4 4 null
2 2132 09/21 B 5 2 4 null
3 4535 09/22 A 2 2 4 null
4 4535 09/22 B 5 2 3 null
5 9023 09/21 A 2 1 5 null
6 9023 09/21 B 7 2 2 null
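If it helps to see the regrouping logic outside of SQL, here is a rough Python sketch of the same idea: split each column name at its last underscore into a module prefix and an index, then collect the values per module. The function name `unpivot_by_prefix` is made up for illustration, and unlike the query above it names each output column after the real index in the column name rather than its position.

```python
from collections import defaultdict


def unpivot_by_prefix(row, id_cols=("IMEI", "DATE")):
    """Split one wide row (a dict) into one dict per module prefix."""
    ids = {c: row[c] for c in id_cols}
    groups = defaultdict(dict)
    for key, value in row.items():
        if key in id_cols:
            continue
        # "A_1" -> module "A", index "1" (split at the last underscore)
        module, _, index = key.rpartition("_")
        groups[module][int(index)] = value

    out = []
    for module in sorted(groups):
        rec = dict(ids, MODULE=module)
        for i in sorted(groups[module]):
            rec[f"val_{i}"] = groups[module][i]
        out.append(rec)
    return out


row = {"IMEI": 2132, "DATE": "09/21",
       "A_1": 2, "A_2": 4, "A_3": 4,
       "B_1": 5, "B_2": 2, "B_3": 4}
result = unpivot_by_prefix(row)
```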
I've got a table that looks something like the following
Date       | Key | Metric
2021-01-31 | A   | 6
2021-02-28 | A   | 3
2021-05-31 | A   | 3
2021-03-31 | B   | 4
2021-04-30 | B   | 1
2021-05-31 | B   | 2
What I'd like to do is insert a row with a metric of 0 for Key A for each missing date (2021-03-31 and 2021-04-30), since Key A had already appeared in January and February.
Key B, on the other hand, would ideally stay untouched since it has metrics associated with every date after its appearance. (The table I'm working with happens to be monthly, but I'm sure I could make the changes to make a daily solution work here)
So, ideally we'd end up with a table looking like the following:
Date       | Key | Metric
2021-01-31 | A   | 6
2021-02-28 | A   | 3
2021-03-31 | A   | 0
2021-04-30 | A   | 0
2021-05-31 | A   | 3
2021-03-31 | B   | 4
2021-04-30 | B   | 1
2021-05-31 | B   | 2
That's all for now, thank you very much everyone
Fiddle for MySQL 8.0+
There are various ways to do this. The following assumes MySQL 8.0 or better, but less convenient solutions exist for versions prior to this.
The following assumes (xkey, xdate) is unique or the primary key of our table.
CTE term | Description
cte1     | Determine the range of dates in our full sequence
cte2     | Generate the full list of last dates in each month of the range
cte3     | Generate a list of (xkey, MIN(xdate)) pairs
Final    | Now generate every potential new row to insert
The INSERT IGNORE only inserts rows which do not already exist, based on the primary key. If a constraint violation would occur, IGNORE will skip that row.
INSERT IGNORE INTO test
WITH RECURSIVE cte1 (min_date, max_date) AS (
SELECT MIN(xdate) AS min_date
, MAX(xdate) AS max_date
FROM test
)
, cte2 (xdate, max_date) AS (
SELECT min_date, max_date FROM cte1
UNION ALL
SELECT LAST_DAY(xdate + INTERVAL 5 DAY), max_date
FROM cte2 WHERE xdate < max_date
)
, cte3 (xkey, xdate) AS (
SELECT xkey, MIN(xdate) FROM test GROUP BY xkey
)
SELECT cte2.xdate, xkey, 0 FROM cte2 JOIN cte3 ON cte3.xdate < cte2.xdate
;
SELECT * FROM test ORDER BY xkey, xdate;
+------------+------+--------+
| xdate | xkey | metric |
+------------+------+--------+
| 2021-01-31 | A | 6 |
| 2021-02-28 | A | 3 |
| 2021-03-31 | A | 0 |
| 2021-04-30 | A | 0 |
| 2021-05-31 | A | 3 |
| 2021-03-31 | B | 4 |
| 2021-04-30 | B | 1 |
| 2021-05-31 | B | 2 |
+------------+------+--------+
The setup:
CREATE TABLE test ( xdate date, xkey varchar(10), metric int, primary key (xdate, xkey) );
INSERT INTO test VALUES
( '2021-01-31', 'A', 6 )
, ( '2021-02-28', 'A', 3 )
, ( '2021-05-31', 'A', 3 )
, ( '2021-03-31', 'B', 4 )
, ( '2021-04-30', 'B', 1 )
, ( '2021-05-31', 'B', 2 )
;
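For anyone who wants to try the idea without a MySQL server, here is a hedged port to SQLite, run through Python's sqlite3 module. It is only a sketch: `INSERT OR IGNORE` stands in for `INSERT IGNORE`, and the `date(..., 'start of month', '+2 months', '-1 day')` expression is my substitute for `LAST_DAY(xdate + INTERVAL 5 DAY)`. The CTE names `bounds`, `months` and `first_seen` are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (xdate TEXT, xkey TEXT, metric INTEGER,
                   PRIMARY KEY (xdate, xkey));
INSERT INTO test VALUES
  ('2021-01-31', 'A', 6), ('2021-02-28', 'A', 3), ('2021-05-31', 'A', 3),
  ('2021-03-31', 'B', 4), ('2021-04-30', 'B', 1), ('2021-05-31', 'B', 2);
""")

FILL_GAPS = """
WITH RECURSIVE
bounds(min_date, max_date) AS (
  SELECT MIN(xdate), MAX(xdate) FROM test
),
months(xdate, max_date) AS (
  SELECT min_date, max_date FROM bounds
  UNION ALL
  -- last day of the following month
  SELECT date(xdate, 'start of month', '+2 months', '-1 day'), max_date
  FROM months WHERE xdate < max_date
),
first_seen(xkey, first_date) AS (
  SELECT xkey, MIN(xdate) FROM test GROUP BY xkey
)
INSERT OR IGNORE INTO test (xdate, xkey, metric)
SELECT m.xdate, f.xkey, 0
FROM months AS m
JOIN first_seen AS f ON f.first_date < m.xdate
"""
# Rows that already exist collide with the primary key and are skipped.
conn.execute(FILL_GAPS)

rows = conn.execute(
    "SELECT xdate, xkey, metric FROM test ORDER BY xkey, xdate"
).fetchall()
```

Only the two missing rows for key A are added; key B is left untouched because every month after its first appearance already has a row.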
I have two tables that look like this:
Table 1
Type 1 | Type 2 | Type 3 | ...
1 | 3 | 0 | ...
Table 2
Type 1 | Type 2 | Type 3 | ...
3 | 2 | 1 | ...
I would like to combine them into a temporary table like this:
Temporary Table
UID | Type | Table
1 | Type 1 | 1
2 | Type 2 | 1
3 | Type 2 | 1
4 | Type 2 | 1
7 | Type 1 | 2
8 | Type 1 | 2
9 | Type 1 | 2
10 | Type 2 | 2
11 | Type 2 | 2
Essentially, the numbers in tables 1 and 2 are totals and I want to break them out into individual rows in this temporary table.
I started going down the path of selecting from both tables and storing the values into temporary variables. I was then going to loop through every single variable and insert into the temporary table. But I have about 15 columns per table and there has got to be an easier way of doing this. I just don't know what it is.
Does anyone have any insight on this? My knowledge of MySQL stored procedures is incredibly limited.
Not sure of an easy way to do this. One option would be to have a numbers table. Here's a quick approach to getting 1-10 in a common table expression (change as needed).
Then you could join to each table and each type, using union all for each subset. Here is a condensed version:
with numbers as (select 1 n union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all
select 9 union all select 10)
select 'type1' as type, '1' as tab
from numbers n join table1 t on n.n <= t.type1
union all
select 'type2' as type, '1' as tab
from numbers n join table1 t on n.n <= t.type2
union all
select 'type1' as type, '2' as tab
from numbers n join table2 t on n.n <= t.type1
union all
select 'type2' as type, '2' as tab
from numbers n join table2 t on n.n <= t.type2
Demo Fiddle
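Here is a small runnable sketch of the numbers-table idea, using SQLite through Python's sqlite3 module so it can be tested locally. The sample totals are the ones from the question; the `totals` CTE and the recursive `numbers` CTE are my own variation, not part of the answer above, and the UID is added afterwards in Python just to keep the SQL short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (type1 INTEGER, type2 INTEGER, type3 INTEGER);
CREATE TABLE table2 (type1 INTEGER, type2 INTEGER, type3 INTEGER);
INSERT INTO table1 VALUES (1, 3, 0);
INSERT INTO table2 VALUES (3, 2, 1);
""")

EXPAND = """
WITH RECURSIVE numbers(n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM numbers WHERE n < 10   -- raise the limit as needed
),
totals(type, tab, total) AS (
  SELECT 'type1', 1, type1 FROM table1 UNION ALL
  SELECT 'type2', 1, type2 FROM table1 UNION ALL
  SELECT 'type3', 1, type3 FROM table1 UNION ALL
  SELECT 'type1', 2, type1 FROM table2 UNION ALL
  SELECT 'type2', 2, type2 FROM table2 UNION ALL
  SELECT 'type3', 2, type3 FROM table2
)
-- Each total joins to the numbers 1..total, so it yields `total` rows.
SELECT t.type, t.tab
FROM totals AS t
JOIN numbers AS n ON n.n <= t.total
ORDER BY t.tab, t.type
"""
rows = conn.execute(EXPAND).fetchall()

# Number the expanded rows to get a UID column.
with_uid = [(uid, type_, tab) for uid, (type_, tab) in enumerate(rows, start=1)]
```

A total of 0 produces no rows at all, since no number satisfies `n <= 0`.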
I have a booking table in which the booking details for all service bookings are saved like this:
id user_id booking_date booking_id
1 3 2017-01-10 booking1
2 3 2017-01-11 booking1
3 3 2017-01-12 booking1
4 3 2017-01-13 booking1
5 3 2017-01-14 booking1
6 4 2017-01-19 booking2
7 4 2017-01-20 booking2
8 4 2017-01-21 booking2
9 4 2017-01-22 booking2
10 3 2017-02-14 booking3
11 3 2017-02-15 booking3
I want to get the start and end date of each run of consecutive booking dates.
For example, user_id 3 has two booking date ranges:
from `2017-01-10 to 2017-01-14`
and then, after some other records,
from `2017-02-14 to 2017-02-15`
First of all, I don't think extracting sequences like that makes much sense... but OK.
Doing this in a single query would be complicated with the data as it is, so I would first add a column such as "group_id" or "order_id" that stores one shared ID for all bookings that belong together.
To populate it, iterate over the table in ascending ID order and check whether the next (or previous) row has the same user_id.
Once you have the order_id column, you can simply run:
SELECT MIN(booking_date), MAX(booking_date) FROM table GROUP BY order_id
OK, nobody said it would be easy... let's go. This is a gaps-and-islands problem, and it is much easier to solve in PostgreSQL.
Here I apply MySQL user variables to your scenario.
I solved it on SQL Fiddle:
MySQL 5.6 Schema Setup:
create table t ( user_id int, booking_date date );
insert into t values
( 3, '2017-01-10'),
( 3, '2017-01-11'),
( 3, '2017-01-12'),
( 3, '2017-01-13'),
( 3, '2017-01-14'),
( 4, '2017-01-19'),
( 4, '2017-01-20'),
( 4, '2017-01-21'),
( 4, '2017-01-22'),
( 3, '2017-02-14'),
( 3, '2017-02-15');
Query 1:
select user_id, min(booking_date), max(booking_date)
from (
select t1.user_id,
t1.booking_date,
@g := case when(
DATE_ADD(@previous_date, INTERVAL 1 DAY) <> t1.booking_date or
@previous_user <> t1.user_id )
then t1.booking_date
else @g
end as g,
@previous_user:= t1.user_id,
@previous_date:= t1.booking_date
from t t1, ( select
@previous_user := -1,
@previous_date := STR_TO_DATE('01/01/2000', '%m/%d/%Y'),
@g:=STR_TO_DATE('01/01/2000', '%m/%d/%Y') ) x
order by user_id, booking_date
) X
group by user_id, g
Results:
| user_id | min(booking_date) | max(booking_date) |
|---------|-------------------|-------------------|
| 3 | 2017-01-10 | 2017-01-14 |
| 3 | 2017-02-14 | 2017-02-15 |
| 4 | 2017-01-19 | 2017-01-22 |
Explanation: the nested query computes a group code (g) for each range, namely the first date of that range. The outer query then takes the min and the max booking date for each group.
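If window functions are available (MySQL 8.0+, PostgreSQL, SQLite 3.25+), a group code can be derived without user variables: within one user, the day number minus the row number stays constant across a run of consecutive dates. Below is a sketch of that variant, run on SQLite through Python's sqlite3 module so it can be tested locally. The `julianday` call is SQLite-specific and the alias `grp` is my own; treat it as an illustration rather than a drop-in MySQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (user_id INTEGER, booking_date TEXT);
INSERT INTO t VALUES
  (3, '2017-01-10'), (3, '2017-01-11'), (3, '2017-01-12'),
  (3, '2017-01-13'), (3, '2017-01-14'),
  (4, '2017-01-19'), (4, '2017-01-20'), (4, '2017-01-21'), (4, '2017-01-22'),
  (3, '2017-02-14'), (3, '2017-02-15');
""")

ISLANDS = """
SELECT user_id,
       MIN(booking_date) AS start_date,
       MAX(booking_date) AS end_date
FROM (
  SELECT user_id, booking_date,
         -- day number minus row number is constant within a consecutive run
         CAST(julianday(booking_date) AS INTEGER)
           - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY booking_date)
           AS grp
  FROM t
) AS numbered
GROUP BY user_id, grp
ORDER BY user_id, start_date
"""
ranges = conn.execute(ISLANDS).fetchall()
```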
In my table, I have two columns called year and season that I'd like to sort by. Some examples of their values might be:
----------------------------
| id | etc | year | season |
| 0 | ... | 2016 | FALL |
| 1 | ... | 2015 | SPRING |
| 2 | ... | 2015 | FALL |
| 3 | ... | 2016 | SPRING |
----------------------------
How would I go about performing a select where I get the results as such?
| 1 | ... | 2015 | SPRING |
| 2 | ... | 2015 | FALL |
| 3 | ... | 2016 | SPRING |
| 0 | ... | 2016 | FALL |
The easy part would be ORDER BY table.year ASC, but how do I manage the seasons now? Thanks for any tips!
You can do this:
SELECT *
FROM yourtable
ORDER BY year, CASE WHEN season = 'spring' THEN 0 ELSE 1 END;
If you want to do the same for the other two seasons, you can extend the CASE expression, but it is easier and more readable to join against a lookup table, something like this:
SELECT t1.*
FROM yourtable AS t1
INNER JOIN
(
SELECT 'spring' AS season, 0 AS sortorder
UNION
SELECT 'Fall' AS season, 1 AS sortorder
UNION
SELECT 'Winter' AS season, 2 AS sortorder
UNION
SELECT 'summer' AS season, 3 AS sortorder
) AS t2 ON t1.season = t2.season
ORDER BY t1.year, t2.sortorder;
If you want to order by all four seasons, starting with Spring, extend your CASE statement:
ORDER BY CASE season
WHEN 'spring' then 1
WHEN 'summer' then 2
WHEN 'fall' then 3
WHEN 'autumn' then 3
WHEN 'winter' then 4
ELSE 0 -- Default if an incorrect value is entered. Could be 5
END
Alternatively, to handle all possible cases, you might want to build a table with the season name and a sort order. Say, for example, some of your data was in German. You could have a table, SeasonSort, with the fields SeasonName and SortOrder. Then add data:
CREATE TABLE SeasonSort (SeasonName nvarchar(32), SortOrder tinyint)
INSERT INTO SeasonSort (SeasonName, SortOrder)
VALUES
('spring', 1),
('frühling', 1),
('fruhling', 1), -- Anglicized version of German name
('summer', 2),
('sommer', 2),
('fall', 3),
('autumn', 3),
('herbst', 3),
('winter', 4) -- same in English and German
Then your query would become:
SELECT t.*
FROM MyTable t
LEFT JOIN seasonSort ss
ON t.season = ss.SeasonName
ORDER BY t.Year,
isnull(ss.SortOrder, 0)
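Here is a quick way to check the lookup-table approach against the sample rows from the question. It is a sketch that runs on SQLite via Python's sqlite3 module, so `COALESCE` replaces `isnull`, and `lower()` is added because SQLite string comparison is case-sensitive by default while the sample data is upper case. Both substitutions are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (id INTEGER, year INTEGER, season TEXT);
INSERT INTO MyTable VALUES
  (0, 2016, 'FALL'), (1, 2015, 'SPRING'),
  (2, 2015, 'FALL'), (3, 2016, 'SPRING');

CREATE TABLE SeasonSort (SeasonName TEXT, SortOrder INTEGER);
INSERT INTO SeasonSort VALUES
  ('spring', 1), ('summer', 2), ('fall', 3), ('autumn', 3), ('winter', 4);
""")

rows = conn.execute("""
SELECT t.id, t.year, t.season
FROM MyTable AS t
LEFT JOIN SeasonSort AS ss
  ON lower(t.season) = ss.SeasonName
-- unknown season names get sort order 0 and therefore come first
ORDER BY t.year, COALESCE(ss.SortOrder, 0)
""").fetchall()
```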
I have a fairly big table (10,000+ records) that looks more or less like this:
| id | name | contract_no | status |
|----|-------|-------------|--------|
| 1 | name1 | 1022 | A |
| 2 | name2 | 1856 | B |
| 3 | name3 | 1322 | C |
| 4 | name4 | 1322 | C |
| 5 | name5 | 1322 | D |
contract_no is a foreign key which of course can appear in several records and each record will have a status of either A, B, C, D or E.
What I want is to get a list of all the contract numbers, where ALL the records referencing that contract are in status C, D, E, or a mix of those, but if any of the records are in status A or B, omit that contract number.
Is it possible to do this using a SQL query? Or would it be better to export the data and run this analysis in another language like Python or R?
Post aggregate filtering should do the trick
SELECT contract_no FROM t
GROUP BY contract_no
HAVING SUM(status='A')=0
AND SUM(status='B')=0
You can use group by with having to get such contract numbers.
select contract_no
from yourtable
group by contract_no
having count(distinct case when status in ('C','D','E') then status end) >= 1
and count(case when status = 'A' then 1 end) = 0
and count(case when status = 'B' then 1 end) = 0
Not as elegant as the other two answers, but more expressive:
SELECT DISTINCT contract_no
FROM the_table t1
WHERE NOT EXISTS (
SELECT *
FROM the_table t2
WHERE t2.contract_no = t1.contract_no
AND t2.status IN ('A', 'B')
)
Or
SELECT DISTINCT contract_no
FROM the_table
WHERE contract_no NOT IN (
SELECT contract_no
FROM the_table
WHERE status IN ('A', 'B')
)
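To compare the approaches on the sample data from the question, here is a small sketch using SQLite through Python's sqlite3 module. It assumes the table is called `the_table`, and it folds the two SUM conditions into one `status IN ('A', 'B')` test, which is my shorthand rather than something taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE the_table (id INTEGER PRIMARY KEY, name TEXT,
                        contract_no INTEGER, status TEXT);
INSERT INTO the_table VALUES
  (1, 'name1', 1022, 'A'),
  (2, 'name2', 1856, 'B'),
  (3, 'name3', 1322, 'C'),
  (4, 'name4', 1322, 'C'),
  (5, 'name5', 1322, 'D');
""")

# Aggregate version: a comparison evaluates to 0 or 1, so SUM counts matches.
having_result = [r[0] for r in conn.execute("""
SELECT contract_no
FROM the_table
GROUP BY contract_no
HAVING SUM(status IN ('A', 'B')) = 0
ORDER BY contract_no
""")]

# Anti-join version: keep contracts with no row in status A or B.
not_exists_result = [r[0] for r in conn.execute("""
SELECT DISTINCT contract_no
FROM the_table AS t1
WHERE NOT EXISTS (
  SELECT 1
  FROM the_table AS t2
  WHERE t2.contract_no = t1.contract_no
    AND t2.status IN ('A', 'B')
)
ORDER BY contract_no
""")]
```

Both queries return only contract 1322 here. With 10,000 or so records either form should be fine in plain SQL, so exporting to another language is not needed for this.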