select from unknown number of tables with parameter - mysql

I want to select (union) rows from multiple tables, where the tables and their filters are given as parameters.
I have a table w with two columns, table_name and condition.
The table_name column refers to other tables in my DB, and condition is the WHERE clause that should be added to the query:
table_name | condition
---------------------
x | y=2
x | r=3
t | y=2
the query should be something like:
select * from x where y=2
union
select * from x where r=3
union
select * from t where y=2
Of course, the number of unions is not known in advance.
Should this be a stored procedure? A cursor?

One way to get this done. The initial answer used SQL Server syntax; this edit adds the MySQL syntax. Make sure your temp table cannot be accessed by another session at the same time (in MySQL, temporary tables are unique to the connection). Also add your own error checking. In MySQL, set the appropriate varchar size for your needs; I used 1024 across the board just for testing purposes.
MySQL syntax
CREATE table test (
id int,
table_name varchar(1024),
where_c varchar(1024)
);
INSERT into test(id, table_name, where_c) values
(1,'x','y=2'),
(2,'x','r=3'),
(3,'t','y=2');
DROP PROCEDURE IF EXISTS generate_sql;
DELIMITER //
CREATE PROCEDURE generate_sql()
BEGIN
DECLARE v_table_name VARCHAR(1024);
DECLARE v_where_c VARCHAR(1024);
DECLARE table_id INT;
DECLARE counter INT;
DECLARE v_SQL varchar(1024);
CREATE TEMPORARY table test_copy
SELECT * FROM test;
SET v_SQL = '';
SET counter = (SELECT COUNT(1) FROM test_copy);
WHILE counter > 0 DO
SELECT id, table_name, where_c
INTO table_id, v_table_name, v_where_c
FROM test_copy LIMIT 1;
SET v_SQL = CONCAT(v_SQL, 'SELECT * FROM ', v_table_name, ' WHERE ', v_where_c);
DELETE FROM test_copy WHERE id=table_id;
SET counter = (SELECT COUNT(1) FROM test_copy);
IF counter > 0 THEN
SET v_SQL = CONCAT(v_SQL,' UNION ');
ELSE
SET v_SQL = v_SQL;
END IF;
END WHILE;
DROP table test_copy;
SELECT v_SQL;
END //
DELIMITER ;
call generate_sql();
SQL Server syntax
CREATE table test (
id int,
table_name varchar(MAX),
condition varchar(MAX)
);
INSERT into test(id, table_name, condition) values
(1,'x','y=2'),
(2,'x','r=3'),
(3,'t','y=2');
SELECT * INTO #temp FROM test;
DECLARE @SQL varchar(MAX);
SET @SQL='';
while exists (select * from #temp)
begin
DECLARE @table_name varchar(MAX);
DECLARE @condition varchar(MAX);
DECLARE @table_id int;
SELECT top 1 @table_id=id, @table_name=table_name, @condition=condition FROM #temp;
SET @SQL += 'SELECT * FROM ' + @table_name + ' WHERE ' + @condition;
delete #temp where id = @table_id;
if exists (select * from #temp) SET @SQL += ' UNION ';
end
SELECT @SQL;
drop table #temp;

Assuming that tables x and t have the same definition and that you want to ignore duplicate results by using UNION rather than UNION ALL, the following should work:
SET @sql = '';
SELECT GROUP_CONCAT(
CONCAT('SELECT * FROM `', `table_name`, '` WHERE ', `condition`)
SEPARATOR ' UNION ') INTO @sql
FROM w;
SET @sql = CONCAT('SELECT * FROM ( ', @sql, ' ) a;');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
(I've edited your question slightly because you had two different definitions for table x)
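For comparison, the same statement text can also be assembled on the client before it is sent to MySQL. What follows is only a minimal Python sketch, not part of either answer: `quote_identifier` and `build_union_query` are hypothetical helper names, and the rows are assumed to have been fetched from w already. The condition text is pasted in verbatim, just as in the GROUP_CONCAT version, so it has to come from a trusted source.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_union_query(rows):
    """Build one UNION query from (table_name, condition) pairs.

    rows is assumed to hold the content of table w, already fetched.
    The condition is inserted verbatim, so it must be trusted input.
    """
    if not rows:
        raise ValueError("no rows: nothing to select from")
    parts = [
        "SELECT * FROM {} WHERE {}".format(quote_identifier(table), condition)
        for table, condition in rows
    ]
    # UNION (not UNION ALL) drops duplicate rows, as in the question.
    return " UNION ".join(parts)


# Hypothetical sample data mirroring the table in the question.
rows = [("x", "y=2"), ("x", "r=3"), ("t", "y=2")]
print(build_union_query(rows))
```

Quoting the table name guards against odd identifiers, but nothing here validates the condition itself.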

Related

Missing values percentage in MYSQL

I want to find the percentage of missing values for each column of the table whose name is passed as a parameter to the missing_data_perc procedure.
Delimiter $$
CREATE PROCEDURE missing_data_perc(IN tableName VARCHAR(30))
BEGIN
DECLARE i int default 0;
DECLARE n int default 0;
DECLARE columnName VARCHAR(30);
DECLARE perc int;
SET @tname = tableName;
SET @perc = 0;
DROP TEMPORARY TABLE IF EXISTS missing_data;
Create TEMPORARY TABLE missing_data (
data_name VARCHAR(50),
missing_data_percentage int
);
SELECT count(*) from information_schema.columns where table_name = tableName into n;
SET i=0;
while i < n Do
-- Select column_name from information_schema.columns where table_name = tableName limit i,1 into columnName;
-- Set @Q3 = CONCAT('SELECT 1-count(*)/count(columnName) from ? into ?');
-- PREPARE Stmt3 from @Q3;
-- EXECUTE Stmt3 using @tname, @perc;
-- SET perc = @perc;
SELECT 1-count(*)/count(columnName) from tableName into perc;
INSERT into missing_data VALUES (columnName, perc);
SET i = i+1;
End While;
SELECT * from missing_data;
END; $$
Delimiter ;
CALL missing_data_perc("deliveries");
The line below gives me an error:
SELECT 1-count(*)/count(columnName) from tableName into perc;
But if I change it to the statement below, it works:
SELECT 1-count(*)/count(columnName) from deliveries into perc;

MySQL use a wild card in table name

I have software that stores data in these tables. I know how the table names start, but each one always has a numeric suffix that I have no way of knowing in advance.
An example of such a table name is "itemid5_4423".
I know there is a table whose name starts with itemid5, but I have no way to know the suffix number.
Is there a wildcard, something similar to this logic: select * from itemid5_*;
Let's say you have 2 tables like this:
create table itemid5_1111 (id int, itemname varchar(100));
create table itemid5_2222 (id int, itemname varchar(100));
You insert data into them:
insert into itemid5_1111 values (1, 'first table');
insert into itemid5_2222 values (2, 'second table');
Your goal is to get output like this from all itemid5* tables.
+------+--------------+
| id | itemname |
+------+--------------+
| 1 | first table |
| 2 | second table |
+------+--------------+
You can do that by typing:
select * from itemid5_1111
union all select * from itemid5_2222;
But, that's a lot of manual typing. You can make a stored procedure to dynamically query table names starting with itemid5 and then create a SQL dynamically and execute it.
Stored procedure
delimiter $$
drop procedure if exists get_items$$
create procedure get_items()
begin
declare eof boolean default false;
declare mytable varchar(255);
declare first_run boolean default true;
declare tablenames_cursor cursor for
select table_name from information_schema.tables
where table_name like 'itemid%';
declare continue handler for not found
set eof = true;
set @my_query = '';
open tablenames_cursor;
read_loop: loop
fetch tablenames_cursor into mytable;
if eof then
leave read_loop;
end if;
if first_run then
set @my_query = concat('select * from ', mytable);
set first_run = false;
else
set @my_query = concat(@my_query, ' union all ', 'select * from ', mytable);
end if;
end loop;
close tablenames_cursor;
prepare stmt from @my_query;
execute stmt;
deallocate prepare stmt;
end$$
delimiter ;
You call this procedure like so to get your results:
call get_items();
If you created a 3rd table like so:
create table itemid5_3333 (id int, itemname varchar(100));
insert into itemid5_3333 values (3, 'third table');
And then, you called the proc, you'd get
call get_items();
+------+--------------+
| id | itemname |
+------+--------------+
| 1 | first table |
| 2 | second table |
| 3 | third table |
+------+--------------+
I think using the data dictionary to retrieve the result would help. Run this:
select * from information_schema.tables where table_name like 'itemid5_%';
You can choose which of the output columns you want; table_name is the one you need, as used in the WHERE clause.
SELECT
REPLACE
(
GROUP_CONCAT(
CONCAT("SELECT * FROM ", `TABLE_NAME`)
),
",",
" UNION ALL "
)
INTO @sq
FROM
information_schema.tables
WHERE
`TABLE_SCHEMA` = "test";
USE
test;
PREPARE
stmt1
FROM
@sq;
EXECUTE
stmt1;
DELIMITER //
CREATE PROCEDURE merge_tables(IN in_sname VARCHAR(64),IN in_tname VARCHAR(64))
READS SQL DATA
BEGIN
DECLARE sname VARCHAR(64);
DECLARE tname VARCHAR(64);
DECLARE cname VARCHAR(64);
DECLARE done INT DEFAULT FALSE;
DECLARE table_cur CURSOR FOR SELECT table_schema, table_name FROM
information_schema.TABLES WHERE table_schema = in_sname AND table_name LIKE
'table%';
DECLARE column_cur CURSOR FOR SELECT `COLUMN_NAME` FROM
`INFORMATION_SCHEMA`.`COLUMNS` where table_schema = in_sname and table_name
= in_tname;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
-- Build the column list (using the columns of the table named in the second parameter of the procedure call)
SET @column = '';
OPEN column_cur;
column_cur_loop: LOOP
FETCH column_cur INTO cname;
IF done THEN
-- SET @column := CONCAT(@column, ') ');
LEAVE column_cur_loop;
END IF;
IF @column = '' THEN
SET @column := CONCAT(@column,cname);
ELSE
SET @column := CONCAT(@column,',',cname);
END IF;
END LOOP;
CLOSE column_cur;
-- Build the UNION query for all tables starting with table%
SET done = FALSE;
SET @sql = '';
OPEN table_cur;
table_list_loop: LOOP
FETCH table_cur INTO sname, tname;
IF done THEN
LEAVE table_list_loop;
END IF;
IF @sql = '' THEN
SET @sql := CONCAT('INSERT INTO MERGED_TABLE (', @column , ') SELECT ', @column , ' FROM `', sname, '`.`', tname, '`');
ELSE
SET @sql := CONCAT(@sql, ' UNION ALL SELECT ' , @column , ' FROM `', sname, '`.`', tname, '`');
END IF;
END LOOP;
CLOSE table_cur;
PREPARE stmt FROM @sql; -- prepare and execute the dynamically
EXECUTE stmt; -- created query.
DEALLOCATE PREPARE stmt;
END //
DELIMITER ;
call merge_tables('testdb','table1');
testdb is the schema where the tables reside.
table1 is one of the tables to be merged; its column names are used for the column list.
table% in the procedure is the prefix of all the tables that need to be merged.

SQL - Get databases which contain a specific record

So originally I have this as a test run.
SELECT DISTINCT table_schema FROM information_schema.columns WHERE table_schema LIKE '%or';
I have looked around and found queries to show all the databases that contain a specific table.
However is it possible to have a query to go a step further and do the following:
"Select all those databases that have a particular table in them and that, in that table, have a particular record in a particular column."?
You cannot do what you want with a SQL statement.
But, you can use SQL to generate the statement that you want. The basic statement is:
select "tablename"
from tablename
where columnname = value
limit 1
Note that value may need to have single quotes around it. You can generate this with:
select concat('select "', c.table_name, '" ',
'from ', c.table_schema, '.', c.table_name, ' ',
'where ', c.column_name, ' = ', VALUE, ' ',
'limit 1'
)
from information_schema.columns c
where c.table_name = TABLENAME and c.column_name = COLUMN_NAME;
To put all the statements in one long SQL statement, use:
select group_concat(concat('(select "', c.table_name, '" as table_name ',
'from ', c.table_schema, '.', c.table_name, ' ',
'where ', c.column_name, ' = ', VALUE, ' ',
'limit 1)'
) SEPARATOR ' union all '
)
from information_schema.columns c
where c.table_name = TABLENAME and c.column_name = COLUMN_NAME;
I would then just copy the resulting SQL statement and run it. If you like, you can add a prepare statement and run it dynamically.
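If the generated statement is run from application code rather than pasted by hand, the same text can be built on the client. This is a rough Python sketch under stated assumptions, not the answer's own method: `build_probe_query` is a hypothetical helper, the (schema, table, column) rows are assumed to have been read from information_schema.columns beforehand, and `%s` is assumed to be the placeholder style of the MySQL driver in use. Each branch is wrapped in parentheses so that its LIMIT 1 applies to that branch only, and each branch reports the schema name, since the question asks for databases.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling embedded backticks.

    A literal % is doubled as well, because the text is later handed to
    a driver that is assumed to treat %s as its placeholder.
    """
    return "`" + name.replace("`", "``").replace("%", "%%") + "`"


def build_probe_query(columns, value):
    """Return (sql, params) that list every schema containing the value.

    columns: (table_schema, table_name, column_name) tuples, assumed to
    come from information_schema.columns.
    """
    branches = []
    params = []
    for schema, table, column in columns:
        branches.append(
            "(SELECT %s AS table_schema FROM {}.{} WHERE {} = %s LIMIT 1)".format(
                quote_identifier(schema),
                quote_identifier(table),
                quote_identifier(column),
            )
        )
        # One parameter for the reported schema name, one for the value.
        params.extend([schema, value])
    if not branches:
        raise ValueError("no matching columns: nothing to probe")
    return " UNION ALL ".join(branches), params


# Hypothetical schema names, for illustration only.
sql, params = build_probe_query(
    [("db1", "T1", "C1"), ("db2", "T1", "C1")], "Needle"
)
print(sql)
print(params)
```

Passing the search value as a parameter avoids having to decide whether it needs single quotes around it.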
As an example,
I have a table named T1 with columns (C1, C2), and I am searching for the value 'Needle'.
This stored procedure searches through tables whose names start with T and columns whose names start with C, then loops through them looking for the value 'Needle'. It returns the table_schema, table_name and column_name, together with the number of times 'Needle' is found for that combination.
see this sqlFiddle
CREATE PROCEDURE findDatabase(IN in_value varchar(50))
BEGIN
DECLARE bDone INT;
DECLARE _TableSchema VARCHAR(50);
DECLARE _TableName VARCHAR(50);
DECLARE _ColumnName VARCHAR(50);
DECLARE curs CURSOR FOR SELECT TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME FROM information_schema.columns WHERE TABLE_NAME LIKE "T%" AND COLUMN_NAME LIKE "C%";
DECLARE CONTINUE HANDLER FOR NOT FOUND SET bDone = 1;
DROP TEMPORARY TABLE IF EXISTS tblResults;
CREATE TEMPORARY TABLE IF NOT EXISTS tblResults (
id int auto_increment primary key,
tableSchema varchar(50),
tablename varchar(50),
columnname varchar(50),
timesFound int
);
OPEN curs;
SET bDone = 0;
REPEAT
FETCH curs INTO _TableSchema,_TableName,_ColumnName;
SET @found = 0;
SET @sql = CONCAT("SET @found = (SELECT COUNT(*) FROM ",_TableSchema,".",_TableName,
" WHERE ",_ColumnName,"='",in_value,"')");
PREPARE statement FROM @sql;
EXECUTE statement;
IF (@found > 0) THEN
INSERT INTO tblResults(tableSchema,tableName,columnName,TimesFound) VALUES (_TableSchema,_TableName,_ColumnName,@found);
END IF;
UNTIL bDone END REPEAT;
CLOSE curs;
SELECT DISTINCT TableSchema,TableName,ColumnName,TimesFound FROM tblResults;
DROP TABLE tblResults;
END//

Procedure UNION on tables

How can I UNION all the results from stmtQuery into ONE result set, for example the results from tables basia, Comments_11, etc.?
DELIMITER $$
DROP PROCEDURE IF EXISTS SearchUserY $$
CREATE PROCEDURE `SearchUserY`(IN UserIdValue INT(11) )
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE tableName VARCHAR(50);
DECLARE stmtFields TEXT ;
DECLARE columnName VARCHAR(50) default 'UserId';
DECLARE cursor1 CURSOR FOR
SELECT table_name
FROM information_schema.COLUMNS
WHERE table_schema = 'comments'
AND column_name LIKE '%UserId';
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN cursor1;
read_loop: LOOP
FETCH cursor1 INTO tableName;
IF done THEN
LEAVE read_loop;
END IF;
SET stmtFields = CONCAT('`',tableName,'`','.' , columnName ,'=', UserIdValue) ;
SET @stmtQuery=CONCAT(@sql,'SELECT Nick, Title, Content FROM ' ,'`',tableName,'`', ' WHERE ', stmtFields ) ;
select @stmtQuery;
END LOOP;
PREPARE stmt FROM @stmtQuery ;
EXECUTE stmt ;
DEALLOCATE PREPARE stmt;
CLOSE cursor1;
END
results example (select #stmtQuery):
SELECT Nick, Title, Content FROM `basia` WHERE `basia`.UserId=0
SELECT Nick, Title, Content FROM `Comments_11` WHERE `Comments_11`.UserId=0
... etc
I want to get one result set from all these queries, but right now I get only one result.
Generate the query in a loop using the CONCAT function, add a 'UNION' or 'UNION ALL' clause between the parts, then execute the resulting query with a prepared statement.
Solution without cursor:
SET @resultQuery = NULL;
SELECT
GROUP_CONCAT(
DISTINCT
CONCAT('SELECT Nick, Title, Content FROM ', table_name, ' WHERE UserId = ', UserIdValue)
SEPARATOR '\r\nUNION\r\n'
)
INTO
@resultQuery
FROM
information_schema.COLUMNS
WHERE
table_schema = 'comments' AND column_name LIKE '%UserId';
SELECT @resultQuery;
It will produce result like this:
SELECT Nick, Title, Content FROM table1 WHERE UserId = 10
UNION
SELECT Nick, Title, Content FROM table2 WHERE UserId = 10
UNION
SELECT Nick, Title, Content FROM table3 WHERE UserId = 10
UNION
SELECT Nick, Title, Content FROM table4 WHERE UserId = 10
...
Increase group_concat_max_len variable if needed. It is the maximum allowed result length for the GROUP_CONCAT() function, default value = 1024.
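To judge whether the default is large enough, the size of the generated text can be estimated up front. The Python sketch below is only an illustration: `generated_length` is a hypothetical helper, the table names are made up, and the limit is compared in bytes because group_concat_max_len is a byte length.

```python
DEFAULT_GROUP_CONCAT_MAX_LEN = 1024  # default quoted in the answer above


def generated_length(table_names, user_id, separator="\r\nUNION\r\n"):
    """Return the byte length of the UNION text GROUP_CONCAT would build."""
    parts = [
        "SELECT Nick, Title, Content FROM {} WHERE UserId = {}".format(
            name, user_id
        )
        for name in table_names
    ]
    return len(separator.join(parts).encode("utf-8"))


# Hypothetical case: twenty tables named table1 .. table20.
tables = ["table{}".format(i) for i in range(1, 21)]
needed = generated_length(tables, 10)
if needed > DEFAULT_GROUP_CONCAT_MAX_LEN:
    print("raise group_concat_max_len to at least", needed)
else:
    print("default is enough:", needed)
```

With only a handful of tables the default is usually fine; the text grows by roughly one SELECT per table.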

Select N random records per group

Hello and happy Sunday to everybody.
I need to select N random records from each group.
Starting from Quassnoi's query
http://explainextended.com/2009/03/01/selecting-random-rows/
for selecting X random records, I wrote this stored procedure:
delimiter //
drop procedure if exists casualiPerGruppo //
create procedure casualiPerGruppo(in tabella varchar(50),in campo varchar(50),in numPerGruppo int)
comment 'Select N random records per group'
begin
declare elenco_campi varchar(255);
declare valore int;
declare finite int default 0;
declare query1 varchar(250);
declare query2 varchar(250);
declare query3 varchar(250);
declare query4 varchar(250);
declare cur_gruppi cursor for select gruppo from tmp_view;
declare continue handler for not found set finite = 1;
drop table if exists tmp_casuali;
set @query1 = concat('create temporary table tmp_casuali like ', tabella);
prepare stmt from @query1;
execute stmt;
deallocate prepare stmt;
set @query2 = concat('create or replace view tmp_view as select ',campo,' as gruppo from ',tabella,' group by ',campo);
prepare stmt from @query2;
execute stmt;
deallocate prepare stmt;
open cur_gruppi;
mio_loop:loop
fetch cur_gruppi into valore;
if finite = 1 then
leave mio_loop;
end if;
set @query3 = concat("select group_concat(column_name) into @elenco_campi
from information_schema.columns
where table_name = '",tabella,"' and table_schema = database()");
prepare stmt from @query3;
execute stmt;
deallocate prepare stmt;
set @query4 = concat('insert into tmp_casuali select ',
@elenco_campi,' from (
select @cnt := count(*) + 1,
@lim :=', numPerGruppo,
' from ',tabella,
' where ',campo,' = ', valore,
' ) vars
straight_join
(
select r.*,
@lim := @lim - 1
from ', tabella, ' r
where (@cnt := @cnt - 1)
and rand() < @lim / @cnt and ', campo, ' = ', valore ,
') i');
prepare stmt from @query4;
execute stmt;
deallocate prepare stmt;
end loop;
close cur_gruppi;
select * from tmp_casuali;
end //
delimiter ;
which I call like this, to give you an idea:
create table prova (
id int not null auto_increment primary key,
id_gruppo int,
altro varchar(10)
) engine = myisam;
insert into prova (id_gruppo,altro) values
(1,'aaa'),(2,'bbb'),(3,'ccc'),(1,'ddd'),(1,'eee'),(2,'fff'),
(2,'ggg'),(2,'hhh'),(3,'iii'),(3,'jjj'),(3,'kkk'),(1,'lll'),(4,'mmm');
call casualiPerGruppo('prova','id_gruppo',2);
My problem is that Quassnoi's query, even though it performs very well, can take as long as 1 second on a large recordset. So if I run it several times within my stored procedure, the total time increases a lot.
Can you suggest a better way to solve my problem?
Thanks in advance.
EDIT.
create table `prova` (
`id` int(11) not null auto_increment,
`id_gruppo` int(11) default null,
`prog` int(11) default null,
primary key (`id`)
) engine=myisam charset=latin1;
delimiter //
drop procedure if exists inserisci //
create procedure inserisci(in quanti int)
begin
declare i int default 0;
while i < quanti do
insert into prova (id_gruppo,prog) values (
(floor(1 + (rand() * 100))),
(floor(1 + (rand() * 30)))
);
set i = i + 1;
end while;
end //
delimiter ;
call inserisci(1000000);
@Clodoaldo:
My stored procedure
call casualipergruppo('prova','id_gruppo',2);
gives me 200 records and takes about 23 seconds. Your stored procedure keeps giving me Error Code: 1473 Too high level of nesting for select, even though I increased the varchar size to 20000. I don't know if there is a limit on the number of unions in a query.
I removed the tabella and campo parameters from the procedure just to make it easier to understand. I'm sure you can bring them back.
delimiter //
drop procedure if exists casualiPerGruppo //
create procedure casualiPerGruppo(in numPerGruppo int)
begin
declare valore int;
declare finite int default 0;
declare query_part varchar(200);
declare query_union varchar(2000);
declare cur_gruppi cursor for select distinct id_gruppo from prova;
declare continue handler for not found set finite = 1;
create temporary table resultset (id int, id_gruppo int, altro varchar(10));
set @query_part = 'select id, id_gruppo, altro from (select id, id_gruppo, altro from prova where id_gruppo = @id_gruppo order by rand() limit @numPerGruppo) ss@id_gruppo';
set @query_part = replace(@query_part, '@numPerGruppo', numPerGruppo);
set @query_union = '';
open cur_gruppi;
mio_loop:loop
fetch cur_gruppi into valore;
if finite = 1 then
leave mio_loop;
end if;
set @query_union = concat(@query_union, concat(' union ', @query_part));
set @query_union = replace(@query_union, '@id_gruppo', valore);
end loop;
close cur_gruppi;
set @query_union = substr(@query_union, 8);
set @query_union = concat('insert into resultset ', @query_union);
prepare stmt from #query_union;
execute stmt;
deallocate prepare stmt;
select * from resultset order by id_gruppo, altro;
drop table resultset;
end //
delimiter ;
Wow. That's a complicated way to do something very simple. Try this:
Assuming you have sequential ids (otherwise you could get no rows).
create view random_prova as
select * from prova
where id = (select min(id) from prova) +
floor(RAND(0) * (select max(id) - min(id) from prova));
This will give you 1 random row.
To get multiple rows, either loop in a stored procedure or server program until you get enough rows, or programmatically create a query that employs union.
eg, this will give you 3 random rows:
select * from random_prova
union
select * from random_prova
union
select * from random_prova;
Note that using RAND(0) instead of RAND() means getting a different random number for each invocation. RAND() will give the same value for each invocation in the one statement (so using RAND() with union won't give you multiple rows).
There are some shortcomings with using union: it is possible to get the same row twice by chance. Programmatically calling this until you get enough rows is safer.
To give better performance, use something like java to randomly select the ids for a simple query like
select * from prova where id in (...)
and have java (or perl or whatever) fill in the list with random ids - you would avoid the inefficiency of having to get the id range every time.
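A rough sketch of that client-side idea in Python (standing in for Java or Perl) might look like this. `build_random_id_query` is a hypothetical name, the id range is assumed to have been fetched once and cached, and the ids are assumed to be sequential with no gaps, as stated earlier in this answer.

```python
import random


def build_random_id_query(min_id, max_id, count, rng=random):
    """Pick `count` distinct ids in [min_id, max_id] and build the IN query.

    Assumes every id in the range exists (sequential ids, no gaps).
    rng may be the random module or a random.Random instance.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count > max_id - min_id + 1:
        raise ValueError("asked for more ids than the range holds")
    # sample() never repeats a value, so no row can be picked twice.
    ids = rng.sample(range(min_id, max_id + 1), count)
    return "select * from prova where id in ({})".format(
        ", ".join(str(i) for i in ids)
    )


# Hypothetical range matching the one-million-row test table.
print(build_random_id_query(1, 1000000, 3))
```

Because the ids are sampled without replacement, the duplicate-row problem of the union approach does not arise here.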
Post if your ids are not sequential: there is an efficient way, but its explanation is long.