I am wondering if it is possible to use SQL to create a table that names its columns by index (number). Say I would like to create a table with 10 million or so columns; I definitely don't want to name every column...
I know that I can write a script to generate a long string as an SQL command. However, I would like to know if there is a more elegant way to do so.
Like something I make up here:
CREATE TABLE table_name
(
number_columns 10000000,
data_type INT
)
I guess saying 10 million columns caused a lot of confusion. Sorry about that. I looked up the manuals of several major commercial DBMSs and it seems it is not possible. Thank you for pointing this out.
But another question, which is most important: does SQL support numerical naming of columns? Say all the columns have the same type and there are 50 of them; when referring to them, you would write something like
SELECT COL.INDEX(3), COL.INDEX(2) FROM MYTABLE
Does the language support that?
Couldn't resist looking into this, and found that the MySQL docs say "no" to this:
There is a hard limit of 4096 columns per table, but the effective
maximum may be less for a given table
You can easily do that in Postgres with dynamic SQL. Consider the demo:
DO LANGUAGE plpgsql
$$
BEGIN
EXECUTE '
CREATE TEMP TABLE t ('
|| (
SELECT string_agg('col' || g || ' int', ', ')
FROM generate_series(1, 10) g -- or 1600?
)
|| ')';
END;
$$;
But why would you even want to give life to such a monstrosity?
As @A.H. commented, there is a hard limit on the number of columns in PostgreSQL:
There is a limit on how many columns a table can contain. Depending on
the column types, it is between 250 and 1600. However, defining a
table with anywhere near this many columns is highly unusual and often
a questionable design.
Emphasis mine.
More about table limitations in the Postgres Wiki.
Access columns by index number
As to your additional question: with a schema like the above you can simply write:
SELECT col3, col2 FROM t;
I don't know of a built-in way to reference columns by index. You can use dynamic SQL again. Or, for a table that consists of integer columns exclusively, this will work, too:
SELECT c[3] AS col3, c[2] AS col2
FROM (
SELECT translate(t::text, '()', '{}')::int[] AS c -- transform row to ARRAY
FROM t
) x
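For the dynamic SQL route, here is a minimal sketch of a plpgsql function that resolves an ordinal position to a column name via information_schema and queries it. The function name col_by_index is made up, and the sketch assumes the all-integer table t from the demo above:
CREATE OR REPLACE FUNCTION col_by_index(_tbl text, _idx int)
  RETURNS SETOF int
  LANGUAGE plpgsql AS
$func$
DECLARE
   _col text;
BEGIN
   -- look up the column name at the given ordinal position
   SELECT column_name INTO _col
   FROM   information_schema.columns
   WHERE  table_name = _tbl
   AND    ordinal_position = _idx;

   -- build and run the query dynamically; %I quotes the identifiers safely
   RETURN QUERY EXECUTE format('SELECT %I FROM %I', _col, _tbl);
END
$func$;

-- usage: the third column of t
SELECT * FROM col_by_index('t', 3);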
Generally, when working with databases, your schema should be more or less fixed, so adding columns dynamically isn't built-in functionality.
You can, however, run a loop and repeatedly ALTER TABLE to add columns. In MySQL, a LOOP only works inside a stored program, so wrap it in a procedure that takes the number of columns as a parameter:
DELIMITER //
CREATE PROCEDURE add_columns(IN num_columns INT)
BEGIN
    SET @col_index = 0;
    start_loop: LOOP
        SET @col_index = @col_index + 1;
        IF @col_index <= num_columns THEN
            SET @alter_query = CONCAT('ALTER TABLE table_name ADD COLUMN added_column_', @col_index, ' VARCHAR(50)');
            PREPARE stmt FROM @alter_query;
            EXECUTE stmt;
            DEALLOCATE PREPARE stmt;
            ITERATE start_loop;
        END IF;
        LEAVE start_loop;
    END LOOP start_loop;
END //
DELIMITER ;
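Calling it then adds the columns in one shot (the procedure name and column prefix above are placeholders, so adjust them to taste):
CALL add_columns(50); -- adds added_column_1 through added_column_50 to table_name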
But again, like most of the advice you have been given: if you think you need that many columns, you probably need to take a look at your database design. I have personally never heard of a case that would need that.
Note: As @GDP mentioned, you can have only 4096 columns, and the idea is definitely not recommended; as @GDP also said, the database design needs to be reconsidered to see whether something else would be a better way to handle this requirement.
However, I was just wondering: the absurd requirement aside, if I ever needed to do this, how could I? I thought: why not create a custom / user-defined MySQL routine, e.g. create_table(), that receives the parameters you intend to send and in turn generates and runs the required CREATE TABLE command.
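A minimal sketch of that idea, assuming all columns share a single type passed in as a parameter (the procedure name, column prefix, and parameters are all made up for illustration):
DELIMITER //
CREATE PROCEDURE create_table(IN tbl_name VARCHAR(64), IN num_columns INT, IN col_type VARCHAR(30))
BEGIN
    DECLARE i INT DEFAULT 1;
    -- build the DDL string column by column
    SET @ddl = CONCAT('CREATE TABLE ', tbl_name, ' (');
    WHILE i <= num_columns DO
        SET @ddl = CONCAT(@ddl, 'col', i, ' ', col_type, IF(i < num_columns, ', ', ')'));
        SET i = i + 1;
    END WHILE;
    -- run the generated statement
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END //
DELIMITER ;

-- usage: a 50-column table of INTs
CALL create_table('mytable', 50, 'INT');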
This is an option for finding columns by ordinal value. It might not be the most elegant or efficient, but it works. I am using it to create a new table for faster mappings between data where I need to parse through all the columns/rows.
DECLARE @sqlCommand varchar(1000)
DECLARE @columnNames TABLE (colName varchar(64), colIndex int)
DECLARE @TableName varchar(64) = 'YOURTABLE' --Table Name
DECLARE @rowNumber int = 2 -- y axis
DECLARE @colNumber int = 24 -- x axis
DECLARE @myColumnToOrderBy varchar(64) = 'ID' --use primary key
--Store column names and their ordinal positions in a table variable
INSERT INTO @columnNames (colName, colIndex)
SELECT COL.name AS ColumnName, ROW_NUMBER() OVER (ORDER BY COL.column_id)
FROM sys.tables AS TAB
INNER JOIN sys.columns AS COL ON COL.object_id = TAB.object_id
WHERE TAB.name = @TableName;
DECLARE @colName varchar(64)
SELECT @colName = colName FROM @columnNames WHERE colIndex = @colNumber
--Create dynamic query to retrieve the x,y coordinates from the table
SET @sqlCommand = 'SELECT ' + @colName + ' FROM (SELECT ' + @colName + ', ROW_NUMBER() OVER (ORDER BY ' + @myColumnToOrderBy + ') AS RowNum FROM ' + @TableName + ') t2 WHERE RowNum = ' + CAST(@rowNumber AS varchar(5))
EXEC(@sqlCommand)
I have two databases that I would like to merge. The problem is that there are around 20 relevant tables with unique object IDs that are linked to each other. For example, the table "names":
name        object_id
FirstName   500
and then it has tables like Items:
item_name   object_id   item_id
itemNr1     500         400
the third table would be Items_specialty:
specialty   item_id   specialty_id
power1      400       600
As you see, they are all tied together: a name's object_id is attached to an item, and the item_id is attached to a specialty_id.
However, across the two databases object_id, item_id and specialty_id values are duplicated, and when I'm talking about nearly 100,000 rows it gets complicated. The risk of losing object IDs is high; if that happened, names would end up with the wrong items, etc. So what would be the best way to merge while keeping all object IDs tied to their specific name, following the trail through the tables and updating them all together?
An ideal solution would check whether object_id+1 is unused and, if so, apply it, then do the same in all further tables, and likewise for item_id and specialty_id, so that at the end the same name would still hold the same item and specialty.
I'd really appreciate any tips or possible solutions to explore. I have searched the internet far and wide, but short of paying thousands for a tool I can't seem to find a solution that fits my issue, as usually people only need to merge a couple of tables instead of many like mine.
Thank you in advance.
(The queries used in this method are for SQL Server. You would need the equivalent queries to run this correctly in MySQL.)
Database merge requires a thorough understanding of the data and the design of the database.
For example, the following solution can be used to merge these two databases:
1- With the following command, you can drop all the constraints in the second database in SQL Server:
use db2;
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'';
SELECT @sql = @sql + N'
ALTER TABLE ' + QUOTENAME(s.name) + N'.'
+ QUOTENAME(t.name) + N' DROP CONSTRAINT '
+ QUOTENAME(c.name) + ';'
FROM sys.objects AS c
INNER JOIN sys.tables AS t
ON c.parent_object_id = t.[object_id]
INNER JOIN sys.schemas AS s
ON t.[schema_id] = s.[schema_id]
WHERE c.[type] IN ('D','C','F','PK','UQ')
ORDER BY c.[type];
--PRINT @sql;
EXEC sys.sp_executesql @sql;
2- First, we merge the tables that have no foreign keys into the first database. As we insert the data into the first database's tables and new identity values are generated (I assume your tables use identity columns), we update the related tables in the second database accordingly. For example, look at the following command:
declare @I int = 0
declare @count_t1 int = (select count(*) from table1)
DECLARE @LAST_object_id INT = 0;
DECLARE @OLD_object_id INT = 0;
WHILE @I < @count_t1
BEGIN
SET @LAST_object_id = 0;
SET @OLD_object_id = 0;
INSERT INTO db1.dbo.table1
(
name
)
SELECT
name
FROM db2.dbo.table1
ORDER BY db2.dbo.table1.object_id
OFFSET @I ROWS FETCH FIRST 1 ROWS ONLY
SET @LAST_object_id = (SELECT TOP(1) object_id FROM db1.dbo.table1 ORDER BY db1.dbo.table1.object_id DESC)
SET @OLD_object_id = (SELECT object_id FROM db2.dbo.table1 ORDER BY db2.dbo.table1.object_id OFFSET @I ROWS FETCH FIRST 1 ROWS ONLY)
UPDATE db2.dbo.table2
SET object_id = @LAST_object_id
WHERE object_id = @OLD_object_id
SET @I = @I + 1
END
3- In this step, we merge into the first database the tables that do have foreign keys, knowing that those foreign key values were already updated in step 2. A sketch of this step follows the list below.
4- Repeat step 3 down through the depth of the database until you reach tables whose primary key is not a foreign key for any other table, then merge those tables into the first database.
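As a sketch of step 3, assuming table2 is the Items table from the question with an identity column item_id, and table3 is the Items_specialty table (all of these names are assumptions):
DECLARE @J int = 0
DECLARE @count_t2 int = (SELECT COUNT(*) FROM db2.dbo.table2)
DECLARE @LAST_item_id INT = 0;
DECLARE @OLD_item_id INT = 0;
WHILE @J < @count_t2
BEGIN
-- object_id was already remapped in step 2, so it can be copied as-is
INSERT INTO db1.dbo.table2 (item_name, object_id)
SELECT item_name, object_id
FROM db2.dbo.table2
ORDER BY db2.dbo.table2.item_id
OFFSET @J ROWS FETCH FIRST 1 ROWS ONLY
SET @LAST_item_id = (SELECT TOP(1) item_id FROM db1.dbo.table2 ORDER BY db1.dbo.table2.item_id DESC)
SET @OLD_item_id = (SELECT item_id FROM db2.dbo.table2 ORDER BY db2.dbo.table2.item_id OFFSET @J ROWS FETCH FIRST 1 ROWS ONLY)
-- propagate the newly generated item_id to the child table in db2
UPDATE db2.dbo.table3
SET item_id = @LAST_item_id
WHERE item_id = @OLD_item_id
SET @J = @J + 1
END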
Remember: if the identity values in the first database's tables are lower than those in the second database's tables, the probability of an error in this method increases. So you need to control how the identity grows in SQL Server with the following command:
DECLARE @MAX_0_object_id INT;
DECLARE @MAX_1_object_id INT;
SET @MAX_0_object_id = IDENT_CURRENT('db1.dbo.table1')
SET @MAX_1_object_id = IDENT_CURRENT('db2.dbo.table1')
IF @MAX_1_object_id > @MAX_0_object_id
BEGIN
DBCC CHECKIDENT ('db1.dbo.table1', RESEED, @MAX_1_object_id)
END
Inside a procedure, I want to create a temporary table "report" whose column names come from the row contents of another table, "descriptions", but I get an error because my query uses the variable's name "tmp_description" as the new column name instead of its value. How do I use a variable's value as the name of a new column?
DECLARE n INT DEFAULT 0;
DECLARE i INT DEFAULT 0;
DECLARE tmp_description varchar(30);
...
CREATE TEMPORARY TABLE descriptions (description varchar(30));
insert into descriptions
select distinct description from pure;
SELECT COUNT(*) INTO n FROM descriptions;
SET i=0;
WHILE i<n DO
SELECT * INTO tmp_description FROM (SELECT * FROM descriptions LIMIT i,1) t1;
ALTER TABLE report
ADD COLUMN
tmp_description FLOAT(2) DEFAULT 0.0; <-- I get error here
SET i = i + 1;
END WHILE;
I don't see any value to doing this in a while loop. Your looping mechanism is all off anyway, because you are using LIMIT without ORDER BY -- which means that the row returned on each iteration is arbitrary.
Why not just construct a single statement? First run:
select group_concat('add column ', description, ' numeric(2)' separator ', ') as columns
from descriptions;
Note that float(2) doesn't really make sense to me as a data type. I suspect that you really want a numeric/decimal type.
Then take the results. Prepend them with alter table report and run the code.
You could do this using dynamic SQL, but I see no advantage to doing that.
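For illustration, if descriptions happened to contain the values 'height' and 'weight' (made-up values), the generated columns string would read add column height numeric(2), add column weight numeric(2), and the finished statement to run would be:
alter table report
    add column height numeric(2),
    add column weight numeric(2);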
I have a very large database and for testing, I want to set a certain amount of data to NULL.
As an example, I have 57 columns across 3 tables, all of which need to be nullified. I can't delete the rows, I just need to know that if the row exists and there's no data in those fields, that everything still works.
To clarify, all the data in those fields has been moved to another table, and the old data was not wiped in the migration. To test my reports I need to know that the reports are pulling from the new location, not the old, since as new data is added, it will only go to the new location. Our plan is to generate each report from the old database, migrate, and then generate them again and compare. But to ensure that they are pulling from the right place, we want to wipe the old data so it doesn't produce a false positive.
Is there a way for me to do this in bulk or should I resign myself to writing one comma separated SET statement after another?
You can create the statements using the data from the internal information_schema.COLUMNS table.
Assuming you have this table:
CREATE TABLE my_table (
keep1 INT,
keep2 INT,
set_null1 INT,
set_null2 INT,
set_null3 INT
);
and you want to set all columns to NULL except keep1 and keep2, execute the following script:
set @db_name = 'test';
set @table_name = 'my_table';
set @exclude_columns = 'keep1,keep2';
select concat(
'UPDATE `', @table_name, '` SET\n',
group_concat('`', COLUMN_NAME, '` = NULL' separator ',\n'),
';'
)
from information_schema.COLUMNS c
where c.TABLE_SCHEMA = @db_name
and c.TABLE_NAME = @table_name
and find_in_set(c.COLUMN_NAME, @exclude_columns) = 0;
This will generate the following statement:
UPDATE `my_table` SET
`set_null1` = NULL,
`set_null2` = NULL,
`set_null3` = NULL;
Copy the result and paste it into your UPDATE script. Do this for each of your tables, adjusting the variables @db_name, @table_name and @exclude_columns.
See demo on db-fiddle.
This is a very unusual task for an SQL database, so it's not surprising that it's a bit awkward.
As you know, to set multiple columns to NULL in an UPDATE statement, you'd have to set each column individually.
UPDATE mytable
SET col1 = NULL, col2 = NULL, ... col57 = NULL
WHERE id = ?;
That could be quite a bit of typing. Or it could be a task for code that loops over the column names in your table and concatenates the terms of the UPDATE statement. Up to you.
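If you go the generated-SQL route, here is a minimal sketch in MySQL that builds and executes the statement in one go (the table name mytable and the id primary key are assumptions):
-- build the UPDATE from the catalog, keeping the primary key untouched
SET @sql = (
  SELECT CONCAT('UPDATE mytable SET ',
                GROUP_CONCAT(COLUMN_NAME, ' = NULL' SEPARATOR ', '),
                ' WHERE id = 42')  -- 42 stands in for the target row id
  FROM information_schema.COLUMNS
  WHERE TABLE_SCHEMA = DATABASE()
    AND TABLE_NAME = 'mytable'
    AND COLUMN_NAME <> 'id'
);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;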
An alternative that might be easier is to delete the row and then re-insert it with no values specified except the primary key.
DELETE FROM mytable WHERE id = ?;
INSERT INTO mytable SET id = ?;
By omitting the other columns, they'll be NULL or else take a DEFAULT value defined in your table. If you want those columns with defaults to be NULL too, you'll have to specify that.
INSERT INTO mytable SET id = ?, col23 = NULL;
I have a large table (~50,000 rows) with 20 columns, and I want to "split" the data based upon the values in the third column (called FEATURE_CLASS). The values of FEATURE_CLASS are all of type VARCHAR, and I want to create however many tables I'd need to replace the single, large table with many smaller tables, each named after the corresponding FEATURE_CLASS value.
Not sure of the best way to go about this, I was thinking something along the lines of creating a temporary table which would serve as an index, each row carrying a unique value of FEATURE_CLASS, then to iterate over the temp table and perform copying operations for each row of the temp table. I'm not sure really where to go from here, any help/ideas would be appreciated. Thanks!
The basic idea is to create a result set of statements that will create the tables for you and populate with the correct data. Run the below to generate the statements. You can then execute these statements manually (copy/paste) or from within a script.
SQL Server Example
SELECT DISTINCT
'SELECT * INTO [' + TableName.FEATURE_CLASS + '] FROM TableName WHERE FEATURE_CLASS = ''' + TableName.FEATURE_CLASS + ''';'
FROM
TableName
If you have any special characters in the FEATURE_CLASS column, you might want to consider removing them in the script above to prevent table names that are either invalid or tough to work with.
For example:
...
'SELECT * INTO [' + REPLACE(TableName.FEATURE_CLASS, '.', '') + '] FROM TableName WHERE FEATURE_CLASS = ''' + TableName.FEATURE_CLASS + ''';'
...
MySQL Example
SELECT DISTINCT
CONCAT('CREATE TABLE `', DB1.FEATURE_CLASS,
'` AS SELECT * FROM DB1 WHERE FEATURE_CLASS = ''',
DB1.FEATURE_CLASS, ''';') AS statements
FROM DB1;
This will give you a MySQL command something like this:
CREATE TABLE `feature_class_value` AS
SELECT * FROM DB1
WHERE FEATURE_CLASS = 'feature_class_value';
Check out the MySQL docs for more info on CREATE TABLE SELECT options https://dev.mysql.com/doc/refman/5.7/en/create-table-select.html.
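If you'd rather not copy/paste the generated statements, they can also be executed from a MySQL stored procedure with a cursor. A sketch under the same assumptions as the example above (the procedure name is made up):
DELIMITER //
CREATE PROCEDURE split_by_feature_class()
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE fc VARCHAR(255);
    DECLARE cur CURSOR FOR SELECT DISTINCT FEATURE_CLASS FROM DB1;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO fc;
        IF done THEN LEAVE read_loop; END IF;
        -- QUOTE() escapes the value for use inside the generated WHERE clause
        SET @stmt = CONCAT('CREATE TABLE `', fc, '` AS SELECT * FROM DB1 WHERE FEATURE_CLASS = ', QUOTE(fc));
        PREPARE s FROM @stmt;
        EXECUTE s;
        DEALLOCATE PREPARE s;
    END LOOP;
    CLOSE cur;
END //
DELIMITER ;

CALL split_by_feature_class();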
I need to copy/insert all values matching a certain WHERE clause from table A to table B (basically from main tables to their respective history tables).
I don't want to specify the column names, as I want a generic approach that can be used for all the tables that need this ingestion.
Unfortunately, the columns in table A are not always in the same order as in table B, so I can't use select * into #temp from TableA and then insert into tableB from #temp. Plus, table B has 3 generic sys columns that we generate for audit purposes.
My idea was to use the information schema to get the column names, then somehow use the result to get all the values from the source table and add the generic sys columns on top. Is that possible?
I got the column names by using Info schema.
Select
COLUMN_NAME
FROM INFORMATION_SCHEMA.columns
where TABLE_NAME = 'TableA'
The SYS columns are:
sys_date = GETDATE()
sys_flag = '1'
sys_name = SYSTEM_USER
Use a dynamic query in SQL Server.
Please change the table name accordingly.
Declare @Table_Name varchar(50)
SET @Table_Name = 'LoginMst'
Declare @Query varchar(8000)
Declare @ColumnNames varchar(8000)
set @ColumnNames = ''
select @ColumnNames =
case when @ColumnNames = ''
then column_name
else @ColumnNames + coalesce(',' + column_name, '')
end
from information_schema.columns where Table_Name = @Table_Name
SET @Query = 'insert into ' + @Table_Name + '_Log (' + @ColumnNames + ',sys_date,sys_flag,sys_name' + ')
select ' + @ColumnNames + ',Getdate(),''1'',SYSTEM_USER from ' + @Table_Name
--print @Query
Exec(@Query)
You will need to iterate over all the tables you wish to back up, and add the WHERE clause too; a sketch of that follows.
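A sketch of that iteration in SQL Server, reusing the _Log naming convention and sys columns from the answer above (the assumption that every base table has a matching _Log table, and the placeholder WHERE clause, are both made up):
DECLARE @Table_Name sysname
DECLARE @Query varchar(8000)
DECLARE @ColumnNames varchar(8000)
-- iterate over every base table that has a matching _Log table
DECLARE tbl_cursor CURSOR FOR
SELECT t.TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_TYPE = 'BASE TABLE'
AND EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES l
            WHERE l.TABLE_NAME = t.TABLE_NAME + '_Log')
OPEN tbl_cursor
FETCH NEXT FROM tbl_cursor INTO @Table_Name
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @ColumnNames = ''
    select @ColumnNames =
    case when @ColumnNames = ''
    then column_name
    else @ColumnNames + ',' + column_name
    end
    from information_schema.columns where Table_Name = @Table_Name
    SET @Query = 'insert into ' + @Table_Name + '_Log (' + @ColumnNames + ',sys_date,sys_flag,sys_name)'
               + ' select ' + @ColumnNames + ',Getdate(),''1'',SYSTEM_USER from ' + @Table_Name
               + ' where 1 = 1' -- replace with the real filter for each table
    EXEC(@Query)
    FETCH NEXT FROM tbl_cursor INTO @Table_Name
END
CLOSE tbl_cursor
DEALLOCATE tbl_cursor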