MySQL: Select from list of values

I was wondering if I could select given values from a list and populate rows? For example, SELECT 1 as one, 2 as two, 3 as three will populate columns:
one | two | three
------------------------
1 | 2 | 3
I'm looking for a script that populates rows, something like:
values
-------
1
2
3
4
Thanks!

You can UNION the values together, like so:
SELECT 1 AS numbers
UNION SELECT 2
UNION SELECT 3
A much simpler way to do something like this is to create a table with an auto-incremented id, insert an empty string into another column once for each row you need, and then select just the auto-incremented id:
CREATE TEMPORARY TABLE tmp (
id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
val varchar(1)
);
INSERT INTO tmp (val)
values
(""),
(""),
(""),
(""),
(""),
(""),
(""),
(""),
(""),
("");
select id from tmp;

To get a few numbers, the approach from John Ruddell is probably the most convenient: I can easily incorporate an inline view into any query I need to run.
When I need a lot of numbers, for example, 1 through 4000, I can do something like this:
CREATE TABLE digit (d INT(11) NOT NULL PRIMARY KEY);
INSERT INTO digit (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
SELECT thousands.d*1000+hundreds.d*100+tens.d*10+ones.d+1 AS n
FROM digit ones
CROSS
JOIN digit tens
CROSS
JOIN digit hundreds
CROSS
JOIN digit thousands
WHERE thousands.d < 4
I can also add a HAVING clause if the boundaries of the numbers I need aren't quite as neat, e.g.
HAVING n >= 121
AND n <= 2499
If I want to ensure the "numbers" are returned in order, I'll add an ORDER BY clause:
ORDER BY n
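As a rough, runnable check of the digit cross join above, here is a minimal Python sketch. It assumes SQLite (through the standard `sqlite3` module) in place of MySQL, and the helper name `numbers` is made up for illustration. Because `HAVING` on a column alias without `GROUP BY` is a MySQL convenience, the sketch filters in an outer `WHERE` on a derived table instead.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE digit (d INTEGER NOT NULL PRIMARY KEY)")
conn.executemany("INSERT INTO digit (d) VALUES (?)", [(d,) for d in range(10)])

QUERY = """
SELECT n FROM (
    SELECT thousands.d*1000 + hundreds.d*100 + tens.d*10 + ones.d + 1 AS n
    FROM digit ones
    CROSS JOIN digit tens
    CROSS JOIN digit hundreds
    CROSS JOIN digit thousands
    WHERE thousands.d < 4
) AS nums
WHERE n BETWEEN ? AND ?
ORDER BY n
"""


def numbers(low, high):
    """Return the generated numbers between low and high (hypothetical helper)."""
    return [row[0] for row in conn.execute(QUERY, (low, high))]


print(len(numbers(1, 4000)))   # how many numbers the four-digit cross join yields
print(numbers(121, 125))       # a small slice with "untidy" boundaries
```

Four copies of a ten-row table give 10,000 combinations; the `thousands.d < 4` filter keeps 4,000 of them, and the `+ 1` shifts the range from 0..3999 to 1..4000.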

Related

How to get the first record of each type in sequence?

Table Data:
ID
Type
1
A
2
A
3
B
4
A
5
A
6
B
7
B
8
A
9
A
10
A
How to get only rows with IDs 1,3,4,6,8, or the first records on type-change by single query?
We were doing this in code using multiple queries and extensive processing especially for large data, is there a way to do this in a single query?
Use LAG() window function to get for every row the previous row's type and compare it to the current type.
Create a flag column that is true if the 2 types are different and use it to filter the table:
WITH cte AS (
SELECT *, type <> LAG(type, 1, '') OVER (ORDER BY id) flag
FROM tablename
)
SELECT * FROM cte WHERE flag;
I assume that the column type does not contain empty values (nulls or
empty strings).
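To make the LAG() comparison concrete, here is a small Python sketch of the same logic outside the database. The function name `first_of_each_run` is hypothetical; the empty string plays the role of the LAG() default, so the same assumption applies (no empty or NULL types in the data).

```python
# Sample data from the question: (id, type)
rows = [(1, "A"), (2, "A"), (3, "B"), (4, "A"), (5, "A"),
        (6, "B"), (7, "B"), (8, "A"), (9, "A"), (10, "A")]


def first_of_each_run(rows):
    """Return the ids whose type differs from the previous row's type."""
    result = []
    previous_type = ""  # mirrors LAG(type, 1, ''): the first row has no predecessor
    for row_id, row_type in sorted(rows, key=lambda r: r[0]):  # ORDER BY id
        if row_type != previous_type:  # the "flag" column
            result.append(row_id)
        previous_type = row_type
    return result


print(first_of_each_run(rows))
```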

select one row multiple time when using IN()

I have this query :
select
name
from
provinces
WHERE
province_id IN(1,3,2,1)
ORDER BY FIELD(province_id, 1,3,2,1)
The number of values in IN() is dynamic.
How can I get all rows, including duplicates (in this example, 1), in the order given by ORDER BY?
The result should look like this:
name1
name3
name2
name1
Also, I shouldn't use UNION ALL like this:
select * from provinces WHERE province_id=1
UNION ALL
select * from provinces WHERE province_id=3
UNION ALL
select * from provinces WHERE province_id=2
UNION ALL
select * from provinces WHERE province_id=1
You need a helper table here. On SQL Server that can be something like:
SELECT name
FROM (Values (1),(3),(2),(1)) As list (id) --< List of values to join to as a table
INNER JOIN provinces ON province_id = list.id
Update: In MySQL, the approach described in "Split Comma Separated String Into Temp Table" can be used to split a string parameter into a helper table.
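The helper-table idea can be sketched end to end in Python. This is only an illustration under assumptions: SQLite (via `sqlite3`) stands in for the real database, the temporary table `id_list` and the function `names_in_list_order` are made-up names, and the `pos` column is added so the list order survives the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE provinces (province_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO provinces VALUES (?, ?)",
                 [(1, "name1"), (2, "name2"), (3, "name3")])


def names_in_list_order(conn, ids):
    """Join provinces against a helper table holding the id list (hypothetical helper)."""
    conn.execute("DROP TABLE IF EXISTS id_list")
    conn.execute("CREATE TEMP TABLE id_list (pos INTEGER PRIMARY KEY, id INTEGER NOT NULL)")
    # One helper row per list entry, so a repeated id yields a repeated result row.
    conn.executemany("INSERT INTO id_list (pos, id) VALUES (?, ?)",
                     list(enumerate(ids, start=1)))
    rows = conn.execute(
        "SELECT p.name FROM id_list AS l "
        "INNER JOIN provinces AS p ON p.province_id = l.id "
        "ORDER BY l.pos"
    )
    return [name for (name,) in rows]


print(names_in_list_order(conn, [1, 3, 2, 1]))
```

Because the join produces one output row per helper row, duplicates in the list come back as duplicates in the result, which a plain IN() cannot do.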
To get the same row more than once you need to join in another table. I suggest creating, only once(!), a helper table. This table will just contain a series of natural numbers (1, 2, 3, 4, etc.). Such a table can be useful for many other purposes.
Here is the script to create it:
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
For the task at hand it is not necessary to add many records: you only need to make sure that the seq table has at least as many rows as the largest number of repetitions of a single value in your IN list. I guess 128 will be good enough, but feel free to double the number of records a few more times.
Once you have the above, you can write queries like this:
select province_id,
name,
@pos := instr(@in2 := insert(@in2, @pos+1, 1, '#'),
concat(',',province_id,',')) ord
from (select @in := '0,1,2,3,1,0', @in2 := @in, @pos := 10000) init
inner join provinces
on find_in_set(province_id, @in)
inner join seq
on num <= length(replace(@in, concat(',',province_id,','),
concat(',+',province_id,',')))-length(@in)
order by ord asc
Output for the sample data and sample in list:
| province_id | name | ord |
|-------------|--------|-----|
| 1 | name 1 | 2 |
| 2 | name 2 | 4 |
| 3 | name 3 | 6 |
| 1 | name 1 | 8 |
How it works
You need to put the list of values in the assignment to the variable @in. For it to work, every valid id must be wrapped between commas, which is why there is a dummy zero at the start and the end.
By joining in the seq table the result set can grow. The number of records joined in from seq for a particular provinces record is equal to the number of occurrences of the corresponding province_id in the list @in.
There is no out-of-the-box function to count the number of such occurrences, so the expression at the right of num <= may look a bit complex. But it just adds a character for every match in @in and checks how much the length grows by that action. That growth is the number of occurrences.
In the select clause the position of the province_id in the @in list is returned and used to order the result set, so it corresponds to the order in the @in list. In fact, the position is taken with reference to @in2, which is a copy of @in, but is allowed to change:
While this @pos is being calculated, the number at the previously found @pos in @in2 is destroyed with a # character, so the same province_id cannot be found again at the same position.
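The occurrence-counting trick described above (add one character per match, then measure how much the string grew) can be tried in isolation. The following Python sketch mirrors the length(replace(...)) - length(...) expression; the function name `occurrences` is an assumption for illustration.

```python
def occurrences(in_list, province_id):
    """Count matches of ',id,' in in_list by measuring how much a replace grows it."""
    needle = f",{province_id},"
    marked = in_list.replace(needle, f",+{province_id},")  # one extra '+' per match
    return len(marked) - len(in_list)


in_list = "0,1,2,3,1,0"  # the list from the query, padded with dummy zeros

for province_id in (1, 2, 3, 4):
    print(province_id, occurrences(in_list, province_id))

# Caveat of this sketch: matches are non-overlapping, so two adjacent
# equal ids share a comma and only the first one is counted.
print(occurrences("0,1,1,0", 1))
```

If the list can contain the same id twice in a row, it may be worth checking how the SQL expression behaves in that case before relying on it.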
It's unclear exactly what you want, but here's why it's not working the way you expect. The IN keyword is shorthand for a condition like ... WHERE province_id = 1 OR province_id = 2 OR province_id = 3 OR province_id = 1. Since province_id = 1 already evaluates to true at the beginning of that condition, it doesn't matter that it is included again later. Listing a value twice therefore has no bearing on whether the result contains a duplicate row.

mysql distribution of combinations/values

I have a mysql table which contains some random combination of numbers. For simplicity take the following table as example:
index|n1|n2|n3
1 1 2 3
2 4 10 32
3 3 10 4
4 35 1 2
5 27 1 3
etc
What I want to find out is the number of times a combination has occurred in the table. For instance, how many times has the combination of 4 10 or 1 2 or 1 2 3 or 3 10 4 etc. occurred?
Do I have to create another table that contains all possible combinations and do comparison from there or is there another way to do this?
For a single combination, this is easy:
SELECT COUNT(*)
FROM my_table
WHERE n1 = 3 AND n2 = 10 AND n3 = 4
If you want to do this with multiple combinations, you could create a (temporary) table of them and join that table with your data, something like this:
CREATE TEMPORARY TABLE combinations (
id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
n1 INTEGER, n2 INTEGER, n3 INTEGER
);
INSERT INTO combinations (n1, n2, n3) VALUES
(1, 2, NULL), (4, 10, NULL), (1, 2, 3), (3, 10, 4);
SELECT c.n1, c.n2, c.n3, COUNT(t.id) AS num
FROM combinations AS c
LEFT JOIN my_table AS t
ON (c.n1 = t.n1 OR c.n1 IS NULL)
AND (c.n2 = t.n2 OR c.n2 IS NULL)
AND (c.n3 = t.n3 OR c.n3 IS NULL)
GROUP BY c.id;
Note that this query as written is not very efficient due to the OR c.n? IS NULL clauses, which MySQL isn't smart enough to optimize. If all your combinations contain the same number of terms, you can leave those out, which will allow the query to make use of indexes on the data table.
Ps. With the query above, the combination (1, 2, NULL) won't match (35, 1, 2). However, (NULL, 1, 2) will, so, if you want both, a simple workaround would be to just include both patterns in your table of combinations.
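To see the NULL-as-wildcard join in action on the question's sample rows, here is a small Python sketch. It assumes SQLite (via `sqlite3`) rather than MySQL, and it adds the (NULL, 1, 2) pattern and an unmatched (7, 8, 9) pattern purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, n1 INTEGER, n2 INTEGER, n3 INTEGER)")
conn.executemany("INSERT INTO my_table (n1, n2, n3) VALUES (?, ?, ?)",
                 [(1, 2, 3), (4, 10, 32), (3, 10, 4), (35, 1, 2), (27, 1, 3)])

conn.execute("CREATE TABLE combinations (id INTEGER PRIMARY KEY, n1 INTEGER, n2 INTEGER, n3 INTEGER)")
conn.executemany("INSERT INTO combinations (n1, n2, n3) VALUES (?, ?, ?)",
                 [(1, 2, None), (None, 1, 2), (4, 10, None),
                  (1, 2, 3), (3, 10, 4), (7, 8, 9)])

# A NULL in a pattern column matches any value in that column.
QUERY = """
SELECT c.n1, c.n2, c.n3, COUNT(t.id) AS num
FROM combinations AS c
LEFT JOIN my_table AS t
  ON (c.n1 = t.n1 OR c.n1 IS NULL)
 AND (c.n2 = t.n2 OR c.n2 IS NULL)
 AND (c.n3 = t.n3 OR c.n3 IS NULL)
GROUP BY c.id
"""

counts = {(n1, n2, n3): num for n1, n2, n3, num in conn.execute(QUERY)}
for pattern, num in counts.items():
    print(pattern, num)
```

The LEFT JOIN together with COUNT(t.id) is what lets a pattern with no matching rows show up with a count of zero instead of disappearing.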
If you actually have many more columns than shown in your example, and you want to match patterns that occur in any set of consecutive columns, then you really should pack your columns into a string and use a LIKE or REGEXP query. For example, if you concatenate all your data columns into a comma-separated string in a column named data, you could search it like this:
INSERT INTO combinations (pattern) VALUES
('1,2'), ('4,10'), ('1,2,3'), ('3,10,4'), ('7,8,9');
SELECT c.pattern, COUNT(t.id) AS num
FROM combinations AS c
LEFT JOIN my_table AS t
ON CONCAT(',', t.data, ',') LIKE CONCAT('%,', c.pattern, ',%')
GROUP BY c.id;
You could make this query somewhat faster by making the prefixes and suffixes added with CONCAT() part of the actual data in the tables, but this is still going to be a fairly inefficient query if you have a lot of data to search, because it cannot make use of indexes. If you need to do this kind of substring searching on large datasets efficiently, you may want to use something better suited for that specific purpose than MySQL.
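The reason for wrapping both the data and the pattern in commas can be shown with a few lines of Python. This is only a sketch of the matching rule, not of the query itself; the function names and the packed sample strings are assumptions based on the question's table.

```python
# The question's rows, packed into comma-separated strings (assumed layout).
data_rows = ["1,2,3", "4,10,32", "3,10,4", "35,1,2", "27,1,3"]


def count_matches(pattern, data_rows):
    """Count rows containing pattern as whole consecutive values (comma-wrapped)."""
    wrapped = f",{pattern},"
    return sum(1 for data in data_rows if wrapped in f",{data},")


def count_naive(pattern, data_rows):
    """Count rows with a plain substring test, for comparison."""
    return sum(1 for data in data_rows if pattern in data)


for pattern in ("1,2", "4,10", "1,2,3", "3,10,4", "7,8,9"):
    print(pattern, count_matches(pattern, data_rows))

# Without the commas, '5,1' would wrongly match inside '35,1,2'.
print(count_naive("5,1", data_rows), count_matches("5,1", data_rows))
```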
You only have three columns in the table, so you are looking for combinations of 1, 2, and 3 elements.
For simplicity, I'll start with the following table:
select `index`, n1 as n from t union all
select `index`, n2 from t union all
select `index`, n3 from t union all
select distinct `index`, -1 from t union all
select distinct `index`, -2 from t
Let's call this "values". Now, we want to get all triples from this table for a given index. In this case, -1 and -2 represent NULL.
select (case when v1.n < 0 then NULL else v1.n end) as n1,
(case when v2.n < 0 then NULL else v2.n end) as n2,
(case when v3.n < 0 then NULL else v3.n end) as n3,
count(*) as NumOccurrences
from `values` v1 join
`values` v2
on v1.n < v2.n and v1.`index` = v2.`index` join
`values` v3
on v2.n < v3.n and v2.`index` = v3.`index`
group by n1, n2, n3
This is using the join mechanism to generate the combinations.
This method finds all combinations regardless of ordering (so 1, 2, 3 is the same as 2, 3, 1). Also, this ignores duplicates, so it cannot find (1, 2, 2) if 2 is repeated twice.
SELECT
CONCAT(CAST(n1 AS CHAR(10)),'|',CAST(n2 AS CHAR(10)),'|',CAST(n3 AS CHAR(10))) AS Combination,
COUNT(CONCAT(CAST(n1 AS CHAR(10)),'|',CAST(n2 AS CHAR(10)),'|',CAST(n3 AS CHAR(10)))) AS Occurrences
FROM
MyTable
GROUP BY
CONCAT(CAST(n1 AS CHAR(10)),'|',CAST(n2 AS CHAR(10)),'|',CAST(n3 AS CHAR(10)))
This creates a single column that represents the combination of the values within the 3 columns by concatenating the values. It will count the occurrences of each.

How to find "holes" in an auto_increment column?

When I DELETE, for example, the id 3, I have this:
id | name
1 |
2 |
4 |
5 |
...
Now I want to search for the missing id(s), because I want to fill the id again with:
INSERT INTO xx (id,...) VALUES (3,...)
is there a way to search for "holes" in the auto_increment index?
thanks!
You can find the top value of gaps like this:
select t1.id - 1 as missing_id
from mytable t1
left join mytable t2 on t2.id = t1.id - 1
where t2.id is null
The purpose of AUTO_INCREMENT is to generate simple, unique and meaningless identifiers for your rows. As soon as you plan to re-use those IDs, they're no longer unique (at least not over time), so I have the impression that you are not using the right tool for the job. If you decide to get rid of AUTO_INCREMENT, you can do all your inserts with the same algorithm.
As for the SQL code, this query will match existing rows with the rows that have the next ID:
SELECT a.foo_id, b.foo_id
FROM foo a
LEFT JOIN foo b ON a.foo_id=b.foo_id-1
E.g.:
1 NULL
4 NULL
10 NULL
12 NULL
17 NULL
19 20
20 NULL
24 25
25 26
26 27
27 NULL
So it's easy to filter out rows and get the first gap:
SELECT MIN(a.foo_id)+1 AS next_id
FROM foo a
LEFT JOIN foo b ON a.foo_id=b.foo_id-1
WHERE b.foo_id IS NULL
Take this as a starting point because it still needs some tweaking:
You need to consider the case where the lowest available number is the lowest possible one.
You need to lock the table to handle concurrent inserts.
On my computer it's slow as hell with big tables.
I think the only way you can do this is with a loop.
Any other solution won't show gaps bigger than 1:
insert into XX values (1)
insert into XX values (2)
insert into XX values (4)
insert into XX values (5)
insert into XX values (10)
declare @min int
declare @max int
select @min=MIN(ID) from xx
select @max=MAX(ID) from xx
while @min<@max begin
if not exists(select 1 from XX where id = @min+1) BEGIN
print 'GAP: '+ cast(@min +1 as varchar(10))
END
set @min=@min+1
end
result:
GAP: 3
GAP: 6
GAP: 7
GAP: 8
GAP: 9
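For comparison with the loop, here is a hedged sketch of a set-based alternative that reports each gap as a start/end range, using the same sample ids. It assumes SQLite (via Python's `sqlite3`) instead of the loop's dialect, and the query is an illustration rather than a tuned solution; like the self-joins above, it may be slow on large tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xx (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO xx (id) VALUES (?)", [(1,), (2,), (4,), (5,), (10,)])

# For every id whose successor is missing, the gap starts at id + 1 and
# ends just before the next existing id. The highest id is skipped because
# nothing follows it.
QUERY = """
SELECT a.id + 1 AS gap_start,
       (SELECT MIN(b.id) FROM xx AS b WHERE b.id > a.id) - 1 AS gap_end
FROM xx AS a
WHERE NOT EXISTS (SELECT 1 FROM xx AS c WHERE c.id = a.id + 1)
  AND a.id < (SELECT MAX(id) FROM xx)
ORDER BY gap_start
"""

gap_ranges = conn.execute(QUERY).fetchall()
missing_ids = [n for start, end in gap_ranges for n in range(start, end + 1)]

print(gap_ranges)
print(missing_ids)
```

Expanding the ranges on the client side gives the individual missing ids, so gaps wider than one id are covered as well.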
First, I agree with the comments that you shouldn't try filling in holes. You won't be able to find all the holes with a single SQL statement. You'll have to loop through all possible numbers starting with 1 until you find a hole. You could write a SQL function to do this for you that could then be used in an insert. So if you wrote a function called find_first_hole, you could then call it in an insert like:
INSERT INTO xx (id, ...) VALUES (find_first_hole(), ...)
This is a gaps-and-islands problem; see my (and others') replies here and here. In most cases, gaps-and-islands problems are most elegantly solved using recursive CTEs, which were not available in MySQL before version 8.0.

MySQL String Comparison with Percent Output (Position Very Important)

I am trying to compare two entries of 6 digits, each of which can be either 0 or 1 (e.g. 100001 or 011101). If 3 out of 6 match, I want the output to be .5. If 2 out of 6 match, I want the output to be .33, etc.
Note that position matters. A match only occurs when both entries have a 1 in the first position, both have a 0 in the second position etc.
Here are the SQL commands to create the table
CREATE TABLE sim
(sim_key int,
string int);
INSERT INTO sim (sim_key, string)
VALUES (1, 111000);
INSERT INTO sim (sim_key, string)
VALUES (2, 101101);
My desired output compares the two strings, which share 50% of their characters, and returns 50%.
Is it possible to do this sort of comparison in SQL? Thanks in advance
Have a look at this example.
CREATE TABLE sim (sim_key int, string int);
INSERT INTO sim (sim_key, string) VALUES (1, 111000);
INSERT INTO sim (sim_key, string) VALUES (2, 101101);
select A.string A, B.string B,
sum(case when Substring(A.string,P.Pos,1) = Substring(B.string,P.Pos,1) then 1 else 0 end) Matches,
count(*) as RowCount,
(sum(case when Substring(A.string,P.Pos,1) = Substring(B.string,P.Pos,1) then 1 else 0 end) /
count(*) * 100.0) as PercentMatch
from sim A
cross join sim B
inner join (
select 1 Pos union all select 2 union all select 3
union all select 4 union all select 5 union all select 6) P
on P.Pos between 1 and length(A.string)
where A.sim_key = 1 and B.sim_key = 2
group by A.string, B.string
It is crude and probably includes more than required, but it shows how it can be done. It is better to create a numbers table with just the numbers from 1 to 1000 or so, which can be reused in many queries where a number sequence is required. Such a table would replace the virtual table (select ... union all ...) used in the inner join.
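The position-by-position comparison can be sketched outside SQL as well. The Python below is an illustration only; `to_bits` and `match_fraction` are hypothetical names. It also pads with leading zeros, because a value such as 011101 stored in an int column comes back as 11101.

```python
def to_bits(value, width=6):
    """Render a stored integer such as 11101 as a zero-padded digit string."""
    return str(value).zfill(width)


def match_fraction(a, b):
    """Fraction of positions at which two equal-length digit strings agree."""
    if len(a) != len(b):
        raise ValueError("both strings must have the same length")
    # range(1, n + 1) plays the role of the numbers table (Pos = 1..n)
    matches = sum(1 for pos in range(1, len(a) + 1) if a[pos - 1] == b[pos - 1])
    return matches / len(a)


print(match_fraction(to_bits(111000), to_bits(101101)))
print(to_bits(11101))
```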
Instead of storing 10010101 as a decimal integer, convert the binary digits to a true integer. To compare two values, apply a bitwise AND, convert the result back to binary, and count the '1' digits to see how many positions match.
To convert: http://dev.mysql.com/doc/refman/5.5/en/binary-varbinary.html
To compare: http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html (bitwise AND)
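A short Python sketch of the bitwise idea, with one caveat worth checking against the question: a bitwise AND only counts positions where both values have a 1, while the question also treats two 0s as a match. Counting the bits that differ (XOR) and subtracting from the width covers both cases. The function names are assumptions for illustration.

```python
def shared_ones(a_bits, b_bits):
    """Positions where both strings have a 1 (what a plain bitwise AND counts)."""
    return bin(int(a_bits, 2) & int(b_bits, 2)).count("1")


def matching_positions(a_bits, b_bits):
    """Positions where the two strings agree, whether on 0 or on 1."""
    width = len(a_bits)
    differing = int(a_bits, 2) ^ int(b_bits, 2)  # 1 wherever the bits differ
    return width - bin(differing).count("1")


a, b = "111000", "101101"
print(shared_ones(a, b))
print(matching_positions(a, b) / len(a))
```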
...