I have a table that has approximately 4 million records. I would like to make it have 240 million like so:
Add an additional column of type BIGINT,
Insert 59 more copies of the data I already have,
And give each 4-million-record group a different value in the additional column.
The value of the additional column would come from another table.
So I have these records (except that I have 4 million of them and not just 3):
| id | value |
+----+-------+
| 1 | 123 |
| 2 | 456 |
| 3 | 789 |
And I want to achieve this (except that I want 60 copies and not just 3):
| id | value | data |
+----+-------+------+
| 1 | 123 | 1 |
| 2 | 456 | 1 |
| 3 | 789 | 1 |
| 4 | 123 | 2 |
| 5 | 456 | 2 |
| 6 | 789 | 2 |
| 7 | 123 | 3 |
| 8 | 456 | 3 |
| 9 | 789 | 3 |
I tried to export my data (using SELECT .. INTO OUTFILE ...), then re-import it (using LOAD DATA INFILE ...) but it is really painfully slow.
Is there a fast way to do this?
Thank you!
It sounds like you want the Cartesian product of two tables written into a new table, since you say "the value of the additional column would come from another table". If so, something like this should work:
create table yourtable (id int, value int);
create table yournewtable (id int, value int, data int);
create table anothertable (data int);
insert into yourtable values (1, 123), (2, 456), (3, 789);
insert into anothertable values (1), (2), (3);
insert into yournewtable
select t.id, t.value, a.data
from yourtable t, anothertable a
SQL Fiddle Demo
Results:
ID VALUE DATA
1 123 1
2 456 1
3 789 1
1 123 2
2 456 2
3 789 2
1 123 3
2 456 3
3 789 3
Edit, side note -- it looks like the id field in your new table is not supposed to keep repeating the same ids? If so, you can use an AUTO_INCREMENT field instead. However, this could mess up the original rows if they aren't sequential.
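For readers who want to sanity-check the shape of this cross join without a MySQL server, here is a small sketch using Python's sqlite3. The table and column names simply mirror the toy schema above; the asker's real schema is an assumption.

```python
import sqlite3

# Toy schema mirroring the answer above; names are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE yourtable (id INTEGER, value INTEGER)")
cur.execute("CREATE TABLE anothertable (data INTEGER)")
cur.executemany("INSERT INTO yourtable VALUES (?, ?)",
                [(1, 123), (2, 456), (3, 789)])
cur.executemany("INSERT INTO anothertable VALUES (?)", [(1,), (2,), (3,)])

# Cartesian product: every source row is paired with every `data` value,
# so n rows times m data values yields n*m rows.
cur.execute("""
    CREATE TABLE yournewtable AS
    SELECT t.id, t.value, a.data
    FROM yourtable t CROSS JOIN anothertable a
""")
count = cur.execute("SELECT COUNT(*) FROM yournewtable").fetchone()[0]
print(count)  # 3 rows x 3 data values = 9
```

With 4 million rows and 60 `data` values, the same join produces the 240 million rows the question asks for in a single `INSERT ... SELECT`.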
First, I would recommend that you create a new table. You can do this using a cross join:
create table WayBigTable as
select t.*, n
from table t cross join
(select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
. . .
select 60
) n;
I'm not sure why you would want a bigint for this column. If you really need that, you can cast to unsigned.
Hmm. You need a cross join of your table with a range. Something along these lines:
INSERT INTO yourtable (value, data) SELECT value, n.data FROM yourtable
CROSS JOIN (SELECT 2 AS data UNION SELECT 3 UNION ... SELECT 60) AS n;
Use this answer, Generating a range of numbers in MySQL, as a reference for generating the number range.
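If typing 59 UNION branches is unappealing, a recursive CTE can generate the range instead; this is available in MySQL 8.0+ and in SQLite. A quick sketch via Python's sqlite3 (used here only as a convenient stand-in, not the asker's environment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Generate the numbers 2..60 without writing 59 UNION branches.
rows = conn.execute("""
    WITH RECURSIVE seq(n) AS (
        SELECT 2
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 60
    )
    SELECT n FROM seq
""").fetchall()
print(len(rows))  # 59 values: 2 through 60
```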
Here's one idea...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,value INT NOT NULL
);
INSERT INTO my_table VALUES
(1 ,123),
(2 ,456),
(3 ,789);
ALTER TABLE my_table ADD COLUMN data INT NOT NULL DEFAULT 1;
SELECT * FROM my_table;
+----+-------+------+
| id | value | data |
+----+-------+------+
| 1 | 123 | 1 |
| 2 | 456 | 1 |
| 3 | 789 | 1 |
+----+-------+------+
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
INSERT INTO my_table SELECT NULL,value,data+i2.i*10+i1.i+1 FROM my_table,ints i1,ints i2;
SELECT * FROM my_table;
+-----+-------+------+
| id | value | data |
+-----+-------+------+
| 1 | 123 | 1 |
| 2 | 456 | 1 |
| 3 | 789 | 1 |
| 4 | 123 | 2 |
| 5 | 456 | 2 |
| 6 | 789 | 2 |
| 7 | 123 | 3 |
| 8 | 456 | 3 |
...
...
| 296 | 456 | 97 |
| 297 | 789 | 97 |
| 298 | 123 | 98 |
| 299 | 456 | 98 |
| 300 | 789 | 98 |
| 301 | 123 | 99 |
| 302 | 456 | 99 |
| 303 | 789 | 99 |
+-----+-------+------+
303 rows in set (0.00 sec)
Note, for 240 million rows, this is still going to be a bit slow :-(
I currently have a file containing data that needs to populate 9 different tables. Each of these tables has a different number of columns and different datatypes, so I need to filter the source file (using the first column, which determines which table each row goes into).
My current method is to create a table with generic column names col_1, col_2, etc., up to the last filled column in the file, and then create 9 views that reference this table. The issue I have is that different data types appear in the same columns, because the tables all have different structures.
Is there a way to create a dynamic schema that filters the .csv the Hive table points to based on the first column?
thanks
Demo
data.csv
1,1,Now,11,22,2016-12-12
1,2,I,33,44,2017-01-01
3,3,heard,55,66,2017-02-02
1,4,you,77,88,2017-03-03
2,5,know,99,1010,2017-04-04
1,6,that,1111,1212,2017-05-05
2,7,secret,1313,1414,2017-06-06
create external table mycsv
(
rec_type int
,id int
,mystring string
,myint1 int
,myint2 int
,mydate date
)
row format delimited
fields terminated by ','
stored as textfile
;
select * from mycsv;
+----------+----+----------+--------+--------+------------+
| rec_type | id | mystring | myint1 | myint2 | mydate |
+----------+----+----------+--------+--------+------------+
| 1 | 1 | Now | 11 | 22 | 2016-12-12 |
| 1 | 2 | I | 33 | 44 | 2017-01-01 |
| 3 | 3 | heard | 55 | 66 | 2017-02-02 |
| 1 | 4 | you | 77 | 88 | 2017-03-03 |
| 2 | 5 | know | 99 | 1010 | 2017-04-04 |
| 1 | 6 | that | 1111 | 1212 | 2017-05-05 |
| 2 | 7 | secret | 1313 | 1414 | 2017-06-06 |
+----------+----+----------+--------+--------+------------+
create table t1(id int,mystring string);
create table t2(id int,mystring string,mydate date);
create table t3(id int,mydate date,myint1 int,myint2 int);
from mycsv
insert into t1 select id,mystring where rec_type = 1
insert into t2 select id,mystring,mydate where rec_type = 2
insert into t3 select id,mydate,myint1,myint2 where rec_type = 3
select * from t1;
+----+----------+
| id | mystring |
+----+----------+
| 1 | Now |
| 2 | I |
| 4 | you |
| 6 | that |
+----+----------+
select * from t2;
+----+----------+------------+
| id | mystring | mydate |
+----+----------+------------+
| 5 | know | 2017-04-04 |
| 7 | secret | 2017-06-06 |
+----+----------+------------+
select * from t3;
+----+------------+--------+--------+
| id | mydate | myint1 | myint2 |
+----+------------+--------+--------+
| 3 | 2017-02-02 | 55 | 66 |
+----+------------+--------+--------+
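The same routing can be prototyped client-side before committing to a Hive layout. A minimal Python sketch, assuming the column positions shown in the demo above (the `layout` mapping is an illustration of the idea, not a Hive API):

```python
import csv
import io

# Sample rows in the same shape as data.csv above.
data = """\
1,1,Now,11,22,2016-12-12
1,2,I,33,44,2017-01-01
3,3,heard,55,66,2017-02-02
2,5,know,99,1010,2017-04-04
"""

# Which source-column indices each target table keeps, keyed by the
# first column (rec_type), mirroring t1/t2/t3 in the demo.
layout = {"1": (1, 2), "2": (1, 2, 5), "3": (1, 5, 3, 4)}
tables = {k: [] for k in layout}

for row in csv.reader(io.StringIO(data)):
    cols = layout.get(row[0])
    if cols:  # skip unknown record types
        tables[row[0]].append(tuple(row[i] for i in cols))

print(tables["1"])  # rows routed to t1: id and mystring only
```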
I have the following SQL table:
bid btype Name world vi
---|----|------------ |--------|---------
1 | 1 | Business 1 | 0 | 44
2 | 4 | Business 2 | 0 | 55
5 | 5 | Business 3 | 0 | 23
3 | 1 | Business 4 | 1 | 99
4 | 2 | Business 5 | 0 | 12
6 | 3 | Business 6 | 0 | 14
7 | 2 | Business 7 | 1 | 55
8 | 1 | Business 8 | 2 | 66
9 | 2 | Business 9 | 2 | 77
10 | 1 | Business 10 | 3 | 88
What I want is to gradually increase the value in the "world" column according to the row's "btype". Every row starts with a value of 0 in the "world" column when it's the first time that "btype" is inserted. What I want is to check whether that "btype" already exists, so that the "world" column no longer gets 0 but 1, and so on.
In other words, no two rows may share both the same "btype" and the same "world": the "btype" can repeat, but for a given "btype" the "world" must differ and gradually increase.
How would I approach such a thing?
E.g.:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(bid INT NOT NULL PRIMARY KEY
,btype INT NOT NULL
);
INSERT INTO my_table VALUES
( 1,1),
( 2,4),
( 5,5),
( 3,1),
( 4,2),
( 6,3),
( 7,2),
( 8,1),
( 9,2),
(10,1);
SELECT bid
, btype
, i
FROM
( SELECT x.*
, CASE WHEN @prev=btype THEN @i:=@i+1 ELSE @i:=0 END i
, @prev:=btype prev
FROM my_table x
,( SELECT @i:=0,@prev:=null) vars
ORDER
BY btype,bid
) n
ORDER
BY bid;
+-----+-------+------+
| bid | btype | i |
+-----+-------+------+
| 1 | 1 | 0 |
| 2 | 4 | 0 |
| 3 | 1 | 1 |
| 4 | 2 | 0 |
| 5 | 5 | 0 |
| 6 | 3 | 0 |
| 7 | 2 | 1 |
| 8 | 1 | 2 |
| 9 | 2 | 2 |
| 10 | 1 | 3 |
+-----+-------+------+
I don't usually write queries like this, so I can't say for certain, but something like this should do the trick.
INSERT INTO theTable(btype, Name, world, vi)
SELECT [val1]
, [val2]
, IFNULL((SELECT MAX(world) FROM theTable WHERE btype = [val1]),-1)+1
, [val3]
;
You might even be able to include the 3rd select expression in a conventional INSERT ... VALUES value list; but as I said, I don't usually write queries like this. (I'm in the apparent minority that likes to check before inserting; though not as a replacement for an appropriate uniqueness constraint.)
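On MySQL 8.0+ (or any engine with window functions) the user-variable trick can be replaced by ROW_NUMBER() partitioned by btype. A sketch run through Python's sqlite3, which supports the same window syntax (SQLite 3.25+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE my_table (bid INTEGER PRIMARY KEY, btype INTEGER)")
cur.executemany("INSERT INTO my_table VALUES (?, ?)",
                [(1, 1), (2, 4), (5, 5), (3, 1), (4, 2),
                 (6, 3), (7, 2), (8, 1), (9, 2), (10, 1)])

# ROW_NUMBER() restarts at 1 for each btype; subtract 1 so "world"
# starts at 0, as the question asks.
rows = cur.execute("""
    SELECT bid, btype,
           ROW_NUMBER() OVER (PARTITION BY btype ORDER BY bid) - 1 AS world
    FROM my_table
    ORDER BY bid
""").fetchall()
print(rows[:3])  # [(1, 1, 0), (2, 4, 0), (3, 1, 1)]
```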
Here is my table:
// table
+----+------+------+
| id | col1 | col2 |
+----+------+------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 2 |
| 6 | 3 | 1 |
| 7 | 3 | 2 |
| 8 | 3 | 3 |
| 9 | 3 | 4 |
| 10 | 3 | 5 |
+----+------+------+
Now I want to search in both col1 and col2. Something like this:
select * from table where col1,col2 IN (1,2);
And I want this output:
+----+------+------+
| id | col1 | col2 |
+----+------+------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 2 |
| 6 | 3 | 1 |
| 7 | 3 | 2 |
+----+------+------+
Well, my problem is with this part: ... where col1,col2 IN (1,2). How can I solve it?
Note: I could do it like this: ... where col1 IN (1,2) or col2 IN (1,2). But that way I have to create two separate indexes, one on each column, whereas I want a query that can use a composite index like this: KEY NameIndex (col1, col2)
You want this, correct?
WHERE col1 IN (1,2)
OR col2 IN (1,2)
If so, turn the OR into a UNION. (This is a common optimization trick.)
( SELECT ... WHERE col1 IN (1,2) )
UNION DISTINCT -- since there are likely to be dups
( SELECT ... WHERE col2 IN (1,2) );
And provide the optimal index for each SELECT:
INDEX(col1),
INDEX(col2)
A composite index of those two columns will not suffice.
(Apologies -- this is probably a summary of the best of the many disjointed comments.)
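To see that the UNION form returns exactly the rows the question asks for, here is a runnable sketch with Python's sqlite3 (sample data copied from the question; index names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER)")
cur.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 2),
                 (6, 3, 1), (7, 3, 2), (8, 3, 3), (9, 3, 4), (10, 3, 5)])
# One single-column index per SELECT, as the answer recommends.
cur.execute("CREATE INDEX idx_col1 ON t(col1)")
cur.execute("CREATE INDEX idx_col2 ON t(col2)")

# Plain UNION (distinct) removes the rows matched by both branches.
rows = cur.execute("""
    SELECT * FROM t WHERE col1 IN (1, 2)
    UNION
    SELECT * FROM t WHERE col2 IN (1, 2)
    ORDER BY id
""").fetchall()
print([r[0] for r in rows])  # [1, 2, 3, 4, 5, 6, 7]
```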
I have table of orders. Each customer (identified by the email field) has his own orders. I need to give a different sequence of order numbers for each customer. Here is example:
----------------------------
| email           | number |
----------------------------
| test@com.com    | 1      |
----------------------------
| example@com.com | 1      |
----------------------------
| test@com.com    | 2      |
----------------------------
| test@com.com    | 3      |
----------------------------
| client@aaa.com  | 1      |
----------------------------
| example@com.com | 2      |
----------------------------
Is possible to do that in a simple way with mysql?
If you want to update data in this table after an insert, you first need a primary key; a simple auto-increment column does the job.
After that you can try various scripts to fill the number column, but as you can see from the other answers, they are not such a "simple way".
I suggest assigning the order number in the insert statement, obtaining it with this simpler query:
select coalesce(max(`number`), 0)+1
from orders
where email='test1@test.com'
If you want to do everything in a single insert (better for performance, and to avoid concurrency problems):
insert into orders (email, `number`, other_field)
select email, coalesce(max(`number`), 0) + 1 as number, 'note...' as other_field
from orders where email = 'test1@test.com';
To be more confident that the same customer is never assigned two orders with the same number, I strongly suggest adding a unique constraint on the columns (email, number).
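A runnable sketch of this single-insert approach, using Python's sqlite3 for brevity. The orders schema here is an assumption based on the question; the unique constraint is the one suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE orders (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        email  TEXT NOT NULL,
        number INTEGER NOT NULL,
        UNIQUE (email, number)   -- no duplicate number per customer
    )
""")

def add_order(email):
    # Single statement: the next per-customer number is MAX(number)+1,
    # or 1 when the customer has no orders yet.
    cur.execute("""
        INSERT INTO orders (email, number)
        SELECT ?, COALESCE(MAX(number), 0) + 1
        FROM orders WHERE email = ?
    """, (email, email))

for e in ["test@com.com", "example@com.com", "test@com.com", "test@com.com"]:
    add_order(e)

rows = cur.execute("SELECT email, number FROM orders ORDER BY id").fetchall()
print(rows)
```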
Create a column order_number, then:
SELECT @i:=1000;
UPDATE yourTable SET order_number = @i:=@i+1;
This will keep incrementing the value in the order_number column, starting right after 1000. You can change the starting value, or even use the primary key as the order number, since it is always unique.
I think you need one more column for this type of output.
Example
+------+------+
| i | j |
+------+------+
| 1 | 11 |
| 1 | 12 |
| 1 | 13 |
| 2 | 21 |
| 2 | 22 |
| 2 | 23 |
| 3 | 31 |
| 3 | 32 |
| 3 | 33 |
| 4 | 14 |
+------+------+
You can get this result:
+------+------+------------+
| i | j | row_number |
+------+------+------------+
| 1 | 11 | 1 |
| 1 | 12 | 2 |
| 1 | 13 | 3 |
| 2 | 21 | 1 |
| 2 | 22 | 2 |
| 2 | 23 | 3 |
| 3 | 31 | 1 |
| 3 | 32 | 2 |
| 3 | 33 | 3 |
| 4 | 14 | 1 |
+------+------+------------+
By running this query, which doesn't need any variable defined:
SELECT a.i, a.j, count(*) as row_number FROM test a
JOIN test b ON a.i = b.i AND a.j >= b.j
GROUP BY a.i, a.j
Hope that helps!
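The self-join trick above is plain SQL, so it can be verified anywhere; here it is run through Python's sqlite3 with the example data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test (i INTEGER, j INTEGER)")
cur.executemany("INSERT INTO test VALUES (?, ?)",
                [(1, 11), (1, 12), (1, 13), (2, 21), (2, 22), (2, 23),
                 (3, 31), (3, 32), (3, 33), (4, 14)])

# For each row, count the rows in the same i-group with j <= this row's j;
# that count is the row's 1-based position within the group.
rows = cur.execute("""
    SELECT a.i, a.j, COUNT(*) AS row_number
    FROM test a JOIN test b ON a.i = b.i AND a.j >= b.j
    GROUP BY a.i, a.j
    ORDER BY a.i, a.j
""").fetchall()
print(rows[:4])  # [(1, 11, 1), (1, 12, 2), (1, 13, 3), (2, 21, 1)]
```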
You can add the number using a SELECT statement, without adding any columns to the orders table.
Try this:
SELECT email,
       @rownumber := IF(email = @email, @rownumber + 1, 1) AS number,
       @email := email AS email_seen
FROM (SELECT * FROM orders ORDER BY email) AS o
JOIN (SELECT @rownumber := 0, @email := '') AS t
I need some help with a MySQL statement.
I have table1 with 7 columns and table2 with 8 columns; the extra column is named ranking. My statement should select everything from table1, sort it by "number of users", and insert it into table2 with the ranking starting at 1, 2, 3, etc.
table 1 :
username | email | number of users
jack a#a.com 75
ralf b#b.com 200
anne c#c.com 12
sonny d#d.com 300
===================================
Here is where I need to insert, with the ranking based on number of users:
table 2
ranking | username | email | number of users
1
2
3
I would avoid using another table; a single query suffices.
create table mytable (
id int not null auto_increment primary key,
username varchar(50),
email varchar(50),
number int
) engine = myisam;
insert into mytable (username,email,number)
values
('a','aaa',10),
('b','bbb',30),
('c','ccc',50),
('d','ddd',30),
('e','eee',20),
('f','fff',45),
('g','ggg',20);
select @r:=@r+1 as rnk,username,email,number
from mytable,(select @r:=0) as r order by number desc
+------+----------+-------+--------+
| rnk | username | email | number |
+------+----------+-------+--------+
| 1 | c | ccc | 50 |
| 2 | f | fff | 45 |
| 3 | b | bbb | 30 |
| 4 | d | ddd | 30 |
| 5 | e | eee | 20 |
| 6 | g | ggg | 20 |
| 7 | a | aaa | 10 |
+------+----------+-------+--------+
7 rows in set (0.00 sec)
This is a smarter version that considers ties
select @r:=@r + 1 as rn, username,email,
@pos:= if(@previous<>number,@r,@pos) as position,
@previous:=number as num
from mytable,(select @r:=0,@pos:=0,@previous:=0) as t order by number desc
+------+----------+-------+----------+--------+
| rn | username | email | position | num |
+------+----------+-------+----------+--------+
| 1 | c | ccc | 1 | 50 |
| 2 | f | fff | 2 | 45 |
| 3 | b | bbb | 3 | 30 |
| 4 | d | ddd | 3 | 30 |
| 5 | e | eee | 5 | 20 |
| 6 | g | ggg | 5 | 20 |
| 7 | a | aaa | 7 | 10 |
+------+----------+-------+----------+--------+
7 rows in set (0.00 sec)
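On MySQL 8.0+ the user variables can be dropped entirely: ROW_NUMBER() gives the running rank and RANK() gives the tie-aware position. A sketch via Python's sqlite3 (same sample data as above; SQLite 3.25+ supports these window functions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mytable (username TEXT, email TEXT, number INTEGER)")
cur.executemany("INSERT INTO mytable VALUES (?, ?, ?)",
                [('a', 'aaa', 10), ('b', 'bbb', 30), ('c', 'ccc', 50),
                 ('d', 'ddd', 30), ('e', 'eee', 20), ('f', 'fff', 45),
                 ('g', 'ggg', 20)])

# RANK() repeats a position for ties and then skips ahead -- the same
# behaviour as the @pos/@previous version above.
rows = cur.execute("""
    SELECT username, number,
           ROW_NUMBER() OVER (ORDER BY number DESC) AS rn,
           RANK()       OVER (ORDER BY number DESC) AS position
    FROM mytable
    ORDER BY rn
""").fetchall()
print(rows[0])  # ('c', 50, 1, 1) -- the highest number ranks first
```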
INSERT INTO table2
SELECT @rank := @rank + 1, table1.* FROM table1
JOIN( SELECT @rank := 0 ) AS init
ORDER BY number_of_users DESC
You need to do something like this:
SELECT * FROM `table1`
INNER JOIN `table2` USING ([a common field name here])
ORDER BY table2.[the field name here]
Good Luck!