I have a table with 100 rows of double values:
a1 a2 a3...
---------
1 2 3
23 55 4
2 3 7
I am planning to use UNION ALL to make that table bigger:
a1 a2 a3...
---------
1 2 3
23 55 4
2 3 7
1 2 3
23 55 4
2 3 7
1 2 3
23 55 4
2 3 7
This is for testing purposes. What would be the most efficient way to do this?
The loop below will increase the size of your table exponentially: first it inserts x records, then 2x, then 4x, then 8x, and so on. You could add DISTINCT or TOP (n) to the SELECT if you just want to add the same number of records each time.
DECLARE @count int
DECLARE @max int
SET @count = 1
SET @max = 10
WHILE @count < @max
BEGIN
    -- each pass re-inserts every existing row, doubling the table
    INSERT INTO myTable (a1, a2, a3)
    SELECT a1, a2, a3 FROM myTable
    SET @count = @count + 1
END
BTW -- not sure what you're trying to test, but you might add something besides whole numbers to your data set -- e.g., 1.01, .99, 55.7, 60, etc.
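To make the doubling concrete, here is a minimal sketch that runs the same INSERT ... SELECT pattern with Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption made so the example is self-contained); the table and column names follow the answer, and the three starting rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (a1 REAL, a2 REAL, a3 REAL)")
conn.executemany(
    "INSERT INTO myTable (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 2, 3), (23, 55, 4), (2, 3, 7)],
)

counts = []
# Same bounds as the WHILE loop: the counter runs from 1 to 9, so 9 passes.
for _ in range(1, 10):
    # Re-insert every row currently in the table, which doubles it.
    conn.execute(
        "INSERT INTO myTable (a1, a2, a3) SELECT a1, a2, a3 FROM myTable"
    )
    counts.append(conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0])

print(counts)
```

Each entry in counts is twice the previous one, so a handful of passes is usually enough for test data.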
EDIT
Per your comment -- if you really want to use union all then...
INSERT INTO myTABLE (a1, a2, a3)
SELECT a1, a2, a3 FROM
(
SELECT a1, a2, a3 FROM myTable
UNION ALL
SELECT a1, a2, a3 FROM myTable
UNION ALL
SELECT a1, a2, a3 FROM myTable
...
) a
INSERT dbo.Table
SELECT /* TOP (n) */ t1.a1, t1.a2, t1.a3
FROM dbo.Table AS t1
CROSS JOIN dbo.Table AS t2
-- repeat CROSS JOINs as necessary
The first cross join will square the row count, the second cross join will cube it, and so on. Like @kuru's answer, you can limit the number of rows added using TOP in case you don't want to do the math.
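A small sketch of how the row count grows with each self cross join, again using Python's sqlite3 module as a stand-in for SQL Server. The table name t is hypothetical, and LIMIT plays the role of TOP (n), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a1 REAL, a2 REAL, a3 REAL)")
conn.executemany(
    "INSERT INTO t (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 2, 3), (23, 55, 4), (2, 3, 7)],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

n = scalar("SELECT COUNT(*) FROM t")
one_join = scalar("SELECT COUNT(*) FROM t AS t1 CROSS JOIN t AS t2")
two_joins = scalar(
    "SELECT COUNT(*) FROM t AS t1 CROSS JOIN t AS t2 CROSS JOIN t AS t3"
)

# Cap the number of inserted rows instead of working out the exponent.
conn.execute(
    "INSERT INTO t (a1, a2, a3) "
    "SELECT t1.a1, t1.a2, t1.a3 FROM t AS t1 CROSS JOIN t AS t2 LIMIT 5"
)
after_insert = scalar("SELECT COUNT(*) FROM t")

print(n, one_join, two_joins, after_insert)
```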
I have the following table ordered by val. I would like to remove all rows that share the top and bottom x and y distinct values in the source column.
If x is 1 and y is 2, then on this table:
val source
1 1
2 3
3 1
3 2
4 4
5 3
7 4
7 5
9 5
The result should be:
val source
2 3
3 2
5 3
Here 2 rows were removed because the top row has source = 1, and 4 rows were removed because the bottom 2 distinct values in source are 4 and 5.
How could I achieve this result?
WITH
cte1 AS (
SELECT val,
source,
COALESCE(source <> LAG(source) OVER (ORDER BY val), 1) like_prev,
COALESCE(source <> LEAD(source) OVER (ORDER BY val), 1) like_next
FROM test
),
cte2 AS (
SELECT val,
source,
SUM(like_prev) OVER (ORDER BY val) sum_prev,
SUM(like_next) OVER (ORDER BY val DESC) sum_next
FROM cte1
)
DELETE test
FROM test
JOIN cte2 USING (source)
WHERE cte2.sum_prev <= @x
OR cte2.sum_next <= @y;
https://dbfiddle.uk/1bRF9BpU (the values in val are made unique).
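As a cross-check on the expected result, here is a sketch of a different formulation (not the window-function query above): a source counts as one of the top x if its first appearance, MIN(val), is among the x smallest, and as one of the bottom y if its last appearance, MAX(val), is among the y largest. It uses Python's sqlite3 module, which accepts LIMIT inside an IN subquery; MySQL does not, so this is an illustration of the logic rather than a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (val INTEGER, source INTEGER)")
conn.executemany(
    "INSERT INTO test (val, source) VALUES (?, ?)",
    [(1, 1), (2, 3), (3, 1), (3, 2), (4, 4), (5, 3), (7, 4), (7, 5), (9, 5)],
)

x, y = 1, 2  # number of distinct sources to drop from the top and the bottom

conn.execute(
    """
    DELETE FROM test
    WHERE source IN (SELECT source FROM test
                     GROUP BY source ORDER BY MIN(val) LIMIT ?)
       OR source IN (SELECT source FROM test
                     GROUP BY source ORDER BY MAX(val) DESC LIMIT ?)
    """,
    (x, y),
)

remaining = conn.execute("SELECT val, source FROM test ORDER BY val").fetchall()
print(remaining)
```

When two sources share the same val at the cut-off, the order between them is arbitrary, so the val column would need to be unique (as in the fiddle) for a fully deterministic result.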
I have the following data
ReasonId  Team  Division  Location
2         A     NULL      L1
3         B     D1        L2
2         A     D2        L1
2         A     D3        L3
I want to show the count grouped by ReasonId for each team, division, and location. There can be rows where division is NULL.
I am trying something like this,
SELECT
COUNT(*) AS TotalRequests, Reason, team
FROM
reports
GROUP BY Reason , team
UNION SELECT
COUNT(*) AS TotalRequests, Reason, location
FROM
reports
GROUP BY Reason , location
UNION SELECT
COUNT(*) AS TotalRequests, Reason, division
FROM
reports
WHERE
ISNULL(division) = 0
GROUP BY Reason , division
;
The output I am getting for the above is,
TotalRequests Reason team
1 2
3 2 A
1 3 B
1 3 D1
1 2 D2
1 2 D3
2 2 L1
1 3 L2
1 2 L3
Is it possible to get an output that looks like this,
ReasonId  Team  TotalByTeam  Location  TotalByLocation  Division  TotalByDivision
2         A     3            L1        2                NULL      0
2         A     3            L3        1                D2        1
2         A     3            L3        1                D3        1
3         B     1            L2        1                D1        1
I am using MySQL 8.0.17. Here is a sample schema and a dbfiddle of the same:
CREATE TABLE `reports` (
`Reason` int(11) DEFAULT NULL,
`Team` text,
`Division` text,
`Location` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO reports (Reason,Team,Division,Location) values (2, 'A',null,'L1');
INSERT INTO reports (Reason,Team,Division,Location) values (3, 'A','D1','L2');
INSERT INTO reports (Reason,Team,Division,Location) values (2, 'A','D2','L1');
INSERT INTO reports (Reason,Team,Division,Location) values (2, 'A','D3','L3');
You should use analytic functions COUNT(...) OVER (...) for this. They are available in MySQL since version 8.0.
select
reasonid,
team,
count(team) over (partition by team) as total_by_team,
location,
count(location) over (partition by location) as total_by_location,
division,
count(division) over (partition by division) as total_by_division
from reports;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=79891554331e8222041ec34eea3fc4ee
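The following sketch shows what COUNT(...) OVER (PARTITION BY ...) returns for the sample rows from the question's INSERT statements. It runs on Python's sqlite3 module rather than MySQL, which assumes the bundled SQLite is version 3.25 or later (the first release with window functions). The column is named Reason, as in the posted schema, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports (Reason INTEGER, Team TEXT, Division TEXT, Location TEXT)"
)
conn.executemany(
    "INSERT INTO reports (Reason, Team, Division, Location) VALUES (?, ?, ?, ?)",
    [(2, "A", None, "L1"), (3, "A", "D1", "L2"),
     (2, "A", "D2", "L1"), (2, "A", "D3", "L3")],
)

rows = conn.execute(
    """
    SELECT Reason,
           Team,     COUNT(Team)     OVER (PARTITION BY Team)     AS total_by_team,
           Location, COUNT(Location) OVER (PARTITION BY Location) AS total_by_location,
           Division, COUNT(Division) OVER (PARTITION BY Division) AS total_by_division
    FROM reports
    ORDER BY Location, Division
    """
).fetchall()

for row in rows:
    print(row)
```

COUNT(Division) skips NULLs, so the row with no division reports a total of 0, which matches the 0 in the desired output.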
Try the script below:
SELECT A.ReasonId,
A.Team,
(SELECT COUNT(*) FROM your_table B WHERE B.ReasonId = A.ReasonId AND B.Team = A.Team) TotalByTeam,
A.Division,
(SELECT COUNT(*) FROM your_table B WHERE B.ReasonId = A.ReasonId AND B.Division = A.Division) TotalByDivision,
A.Location,
(SELECT COUNT(*) FROM your_table B WHERE B.ReasonId = A.ReasonId AND B.Location = A.Location) TotalByLocation
FROM your_table A
I have a Table A as below
id (integer)
follow_up (integer, days under observation)
matched_id (integer)
id ; follow_up ; matched_id
1 ; 10 ; 19
1 ; 10 ; 20
1 ; 10 ; 21
2 ; 5 ; 22
2 ; 5 ; 23
2 ; 5 ; 24
2 ; 5 ; 19
2 ; 5 ; 20
3 ; 6 ; 25
3 ; 6 ; 26
3 ; 6 ; 27
4 ; 7 ; 19
4 ; 7 ; 28
4 ; 7 ; 29
I would like to limit the result to 2 records per id. The records should be picked at random, and each matched_id should be exclusive to one id. For example:
If matched_id 19 and 20 were given to id 1, then 19 and 20 should not be given to id 2.
If matched_id 19 was given to id 1, then 19 should not be given to id 4.
And so on for the rest of the table.
Required output:
id ; follow_up ; matched_id
1 ; 10 ; 19
1 ; 10 ; 20
2 ; 5 ; 22
2 ; 5 ; 23
3 ; 6 ; 25
3 ; 6 ; 26
4 ; 7 ; 28
4 ; 7 ; 29
Please help me. Thank you so much!
This is a very good and very challenging SQL question.
You have a very challenging set of requirements:
1. No matched_id should appear more than once in the result set
2. No ID should be given more than two matches
3. The matching should be random
We will stick to a pure SQL solution, assuming that you can't return, say, a larger result set and do some filtering using business logic in your implementation language.
First, let's tackle random assignment. Randomly ordering items inside of groups is a fun question. I decided to tackle it by ordering on a SHA1 hash of the data in the row (id, follow_up, matched_id), which will give a repeatable result with a feeling of randomness. (This would be best if there were a column that contained the date/time created or modified.)
SELECT * FROM
(
SELECT
a.id,
a.follow_up,
a.matched_id,
a.rank_hash,
count(*) rank
FROM
(SELECT *, SHA1(CONCAT(id, follow_up, matched_id)) rank_hash FROM TableA) a
JOIN
(SELECT *, SHA1(CONCAT(id, follow_up, matched_id)) rank_hash FROM TableA) b
ON a.rank_hash >= b.rank_hash
AND a.id = b.id
GROUP BY a.id, a.matched_id
ORDER BY a.id, rank
) groups
WHERE rank <= 2
GROUP BY matched_id
This might suffice for your use case if there are sufficient matched_id values for each id. But what if there is a hidden fourth requirement:
4. If possible, an ID should receive a match.
In other words, what if, as a result of random shuffling, a matched_id was assigned to an id that had several other matches, but further down the result set it was the only match for an id? An optimal solution in which every ID were matched with a matched_id was possible, but it never happened because all the matched_ids were used up earlier in the process?
For example:
CREATE TABLE TableA
(`id` int, `follow_up` int, `matched_id` varchar(1))
;
INSERT INTO TableA
(`id`, `follow_up`, `matched_id`)
VALUES
(1, 10, 'A'),
(1, 10, 'B'),
(1, 10, 'C'),
(2, 5, 'D'),
(2, 5, 'E'),
(2, 5, 'F'),
(3, 5, 'C')
;
In the above set, if IDs and their matches are assigned randomly and ID 1 gets assigned matched_id C, then ID 3 will not get a matched_id at all.
What if we first find out how many times each matched_id occurs, and order by that first?
SELECT
a.*,
frequency
FROM TableA a
JOIN
( SELECT
matched_id,
count(*) frequency
FROM
TableA
GROUP BY matched_id
) b
ON a.matched_id = b.matched_id
GROUP BY a.matched_id
ORDER BY b.frequency
This is where a middleman programming language might come in handy to help limit the result set.
But note that we also lost our requirement of randomness! As you can see, a pure SQL solution might get pretty ugly. It is indeed possible by combining the techniques outlined above.
Hopefully this will get your imagination firing.
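If filtering in application code is an option after all, the idea can be sketched in a few lines of Python. This is a greedy heuristic under stated assumptions, not a guaranteed optimal matching: ids with the fewest candidate rows are served first, each id's candidates are shuffled, and a matched_id is skipped once another id has taken it. The function name assign_matches and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def assign_matches(rows, per_id=2, seed=None):
    """Pick up to per_id random, globally unique matched_ids for each id."""
    rng = random.Random(seed)
    candidates = defaultdict(list)
    for id_, follow_up, matched_id in rows:
        candidates[id_].append((follow_up, matched_id))

    used = set()
    result = []
    # Serve the ids with the fewest candidates first so they are less
    # likely to be starved by ids that have plenty of alternatives.
    for id_ in sorted(candidates, key=lambda i: len(candidates[i])):
        pool = candidates[id_][:]
        rng.shuffle(pool)
        picked = 0
        for follow_up, matched_id in pool:
            if picked == per_id:
                break
            if matched_id in used:
                continue
            used.add(matched_id)
            result.append((id_, follow_up, matched_id))
            picked += 1
    return sorted(result)

table_a = [
    (1, 10, 19), (1, 10, 20), (1, 10, 21),
    (2, 5, 22), (2, 5, 23), (2, 5, 24), (2, 5, 19), (2, 5, 20),
    (3, 6, 25), (3, 6, 26), (3, 6, 27),
    (4, 7, 19), (4, 7, 28), (4, 7, 29),
]
picked = assign_matches(table_a, per_id=2, seed=42)
for row in picked:
    print(row)
```

Passing a seed makes a run repeatable; leaving it as None gives a different assignment each time.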
With RAND() and MySQL user-defined variables you can achieve this:
SELECT
t.id,
t.follow_up,
t.matched_id
FROM
(
SELECT
randomTable.*,
IF(@sameID = id, @rn := @rn + 1,
IF(@sameID := id, @rn := 1, @rn := 1)
) AS rowNumber
FROM
(
SELECT
*
FROM tableA
ORDER BY id, RAND()
) AS randomTable
CROSS JOIN (SELECT @sameID := 0, @rn := 0) var
) AS t
WHERE t.rowNumber <= 2
ORDER BY t.id
Here's a solution for the specific problem given. It does not scale!
SELECT *
FROM
( SELECT a.matched_id m1
, b.matched_id m2
, c.matched_id m3
, d.matched_id m4
FROM my_table a
JOIN my_table b
ON b.matched_id NOT IN(a.matched_id)
JOIN my_table c
ON c.matched_id NOT IN(a.matched_id,b.matched_id)
JOIN my_table d
ON d.matched_id NOT IN(a.matched_id,b.matched_id,c.matched_id)
WHERE a.id = 1
AND b.id = 2
AND c.id = 3
AND d.id = 4
) x
JOIN
( SELECT a.matched_id n1
, b.matched_id n2
, c.matched_id n3
, d.matched_id n4
FROM my_table a
JOIN my_table b
ON b.matched_id NOT IN(a.matched_id)
JOIN my_table c
ON c.matched_id NOT IN(a.matched_id,b.matched_id)
JOIN my_table d
ON d.matched_id NOT IN(a.matched_id,b.matched_id,c.matched_id)
WHERE a.id = 1
AND b.id = 2
AND c.id = 3
AND d.id = 4
) y
ON y.n1 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n2 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n3 NOT IN(x.m1,x.m2,x.m3,x.m4)
AND y.n4 NOT IN(x.m1,x.m2,x.m3,x.m4)
ORDER
BY RAND() LIMIT 1;
+----+----+----+----+----+----+----+----+
| m1 | m2 | m3 | m4 | n1 | n2 | n3 | n4 |
+----+----+----+----+----+----+----+----+
| 20 | 24 | 27 | 29 | 21 | 23 | 26 | 28 |
+----+----+----+----+----+----+----+----+
So, in this example, the pairs are:
id1: 20,21
id2: 24,23
id3: 27,26
id4: 29,28
I need to create a table function that produces the numbers from 1 up to a specified parameter in column 1. In column 2, if column 1 is divisible by 5 it will say 'Div5', otherwise NULL.
As an example, if I specify that column 1 stops at 5, the end result will look as follows:
1 NULL
2 NULL
3 NULL
4 NULL
5 Div5
I can create the function, but I'm not sure how to generate the first column, or how to say that if column 1 divided by 5 is an integer then column 2 is 'Div5', and if it's a decimal then NULL:
create function MyFunction ()
Returns @Division Table
(Ind int ,
Div5 varchar(30))
AS
begin
Insert Into @Division (Ind, Div5)
select ???,???
Return;
End;
I hope this gives enough detail?
Thank you :)
This should do the trick:
DECLARE @divisor INT = 10, @limit INT = 100;
WITH
L0 AS(SELECT 1 AS C UNION ALL SELECT 1 AS O),
L1 AS(SELECT 1 AS C FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS C FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS C FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS C FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS C FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS N FROM L5)
SELECT N, CASE WHEN N % @divisor = 0 THEN 'Div' + CAST(@divisor AS VARCHAR(100)) ELSE NULL END AS Col2 FROM Nums
WHERE N <= @limit
The two variables set the number the first column should be divisible by and how far the sequence should go. The chain of CTEs just generates the numbers for the first column (numbers tables are really useful for loads of stuff like this). The final SELECT takes all the numbers up to your limit, uses a CASE expression to check whether each one is divisible by the number you specify (remainder 0), and does a bit of string concatenation for the DivX label.
You should easily be able to integrate this logic into your function.
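For a quick way to try the same idea outside SQL Server, here is a sketch using Python's sqlite3 module. It swaps the stacked cross-join CTEs for a recursive CTE, which is a different way of generating the number sequence, and uses SQLite's || operator for concatenation. The function name div_table and the column alias label are hypothetical.

```python
import sqlite3

def div_table(limit, divisor=5):
    """Return (n, label) rows for n = 1..limit, labelling multiples of divisor."""
    conn = sqlite3.connect(":memory:")
    sql = """
        WITH RECURSIVE nums(n) AS (
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM nums WHERE n < :limit
        )
        SELECT n,
               CASE WHEN n % :divisor = 0 THEN 'Div' || :divisor END AS label
        FROM nums
        WHERE n <= :limit
        ORDER BY n
    """
    try:
        return conn.execute(sql, {"limit": limit, "divisor": divisor}).fetchall()
    finally:
        conn.close()

for row in div_table(5):
    print(row)
```

A CASE expression without an ELSE branch yields NULL, which Python receives as None.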
You are looking for the Modulo operator that basically returns the remainder of a division problem.
DECLARE @SOMETBL TABLE (ROWNUM INT, DIVSTATUS CHAR(4))
INSERT @SOMETBL
(ROWNUM)
SELECT 1
UNION
SELECT 5
UNION
SELECT 2
UNION
SELECT 10
UPDATE @SOMETBL
SET DIVSTATUS = CASE WHEN ROWNUM%5 > 0 THEN NULL ELSE 'DIV5' END
SELECT * FROM @SOMETBL
My table looks like:
[Number] [Value1]
1234567 8
1234567C 7
9876543 1
9876543C 2
5555555 3
5555555C 3
I want to search the entries for same values in the first column (except the "C" in the end of the number) and set the higher value in the second column to the lower one.
There are always only two same values (one with "C") and some pairs have same values in the second column and some have different.
The result of the query should be:
Number Value1
1234567 7
1234567C 7
9876543 1
9876543C 1
5555555 3
5555555C 3
The following is not an ideal solution but should do what you want:
update yourTable
set value1 = (
select min(value1) from (
select * from yourTable
) as x
where yourTable.number = x.number + 'C');
I have tested it with this in mysql workbench:
create table yourTable(number varchar (10),value1 int);
insert into yourTable Values('1234567',8);
insert into yourTable Values('1234567C',7);
insert into yourTable Values('9876543',1);
insert into yourTable Values('9876543C',2);
insert into yourTable Values('5555555',3);
insert into yourTable Values('5555555C',3);
insert into yourTable Values('55555556',10);
insert into yourTable Values('55555556C',2);
Then select * from yourTable; will return:
1234567 8
1234567C 7
9876543 1
9876543C 2
5555555 3
5555555C 3
55555556 10
55555556C 2
After the update select * from yourTable; will return:
1234567 7
1234567C 7
9876543 1
9876543C 1
5555555 3
5555555C 3
55555556 2
55555556C 2
Hope that is what you wanted :)
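The same test data can be checked with a short sketch in Python's sqlite3 module. It pairs the rows by stripping a trailing 'C' with SQLite's rtrim(X, Y) instead of relying on string-plus-'C' arithmetic, so it is a variation on the update above rather than a copy of it, and it assumes that numbers never end in 'C' for any other reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (number TEXT, value1 INTEGER)")
conn.executemany(
    "INSERT INTO yourTable (number, value1) VALUES (?, ?)",
    [("1234567", 8), ("1234567C", 7),
     ("9876543", 1), ("9876543C", 2),
     ("5555555", 3), ("5555555C", 3),
     ("55555556", 10), ("55555556C", 2)],
)

# For every row, take the smallest value1 among the rows that share the
# same number once a trailing 'C' has been removed.
conn.execute(
    """
    UPDATE yourTable
    SET value1 = (
        SELECT MIN(x.value1)
        FROM yourTable AS x
        WHERE rtrim(x.number, 'C') = rtrim(yourTable.number, 'C')
    )
    """
)

result = dict(conn.execute("SELECT number, value1 FROM yourTable").fetchall())
print(result)
```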
Actually, you don't need any checking, since there are only 2 values (and thus the query is even simpler):
UPDATE
table
SET
Value1 =
(
SELECT
MIN(t.Value1) -- the question asks for the lower value of each pair
FROM
table t
WHERE
t.Number = table.Number
OR t.Number = table.Number + 'C'
OR t.Number + 'C' = table.Number
)