How to fix "no more spool space" in Teradata? - teradata-sql-assistant

I get a "no more spool space" error when I create a table using QUALIFY ROW_NUMBER().
The goal is to keep the first 100 rows per key.
Each key is composed of the following fields: (top_typ_vision, instid, don_gener3, don_gener4, rg_no, lieu_stkph_cd, id_sect_base_resp)
When I run the SELECT on its own, it works fine; as soon as I wrap it in the CREATE TABLE, I get the "no more spool space" error.
Thank you!
```sql
create multiset table mdc_cobalt_det as (
    sel
        top_typ_vision,
        instid,
        type_enr as type_obj_ofs,
        don_gener1,
        don_gener2,
        don_gener3,
        don_gener4,
        rg_no,
        lieu_stkph_cd,
        id_sect_base_resp
    from PROD_V_CTRL_ANOMALIE
    qualify row_number() over (
        partition by top_typ_vision, instid, don_gener3, don_gener4,
                     rg_no, lieu_stkph_cd, id_sect_base_resp
        order by rg_no
    ) <= 100
)
with data
primary index (top_typ_vision, rg_no, don_gener3, don_gener4, lieu_stkph_cd, id_sect_base_resp);
```

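I suggest that you:
Collect statistics on the input table and run the statement again;
Create mdc_cobalt_det as a NoPI table (NO PRIMARY INDEX) and check how the data is distributed across the columns you chose for the primary index. A skewed primary index concentrates rows on a few AMPs, which can exhaust spool on those AMPs even when plenty of spool is free overall.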
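One way to check the distribution mentioned above is to count rows per candidate primary-index value and look at the heaviest ones. The following is only a sketch: it uses Python with an in-memory SQLite database as a stand-in for Teradata, and the table name `anomalie`, the two columns and the sample rows are invented for illustration. On Teradata the same GROUP BY would run against PROD_V_CTRL_ANOMALIE with the full primary-index column list.

```python
import sqlite3

# Hypothetical stand-in data: SQLite replaces Teradata here, and the table
# name and rows are invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anomalie (top_typ_vision TEXT, rg_no INTEGER)")
rows = [("A", 1)] * 8 + [("B", 2)] + [("C", 3)]
conn.executemany("INSERT INTO anomalie VALUES (?, ?)", rows)

# Count rows per candidate primary-index value, heaviest values first.
# A few values holding most of the rows points to a skewed index.
skew = conn.execute(
    """
    SELECT top_typ_vision, rg_no, COUNT(*) AS row_cnt
    FROM anomalie
    GROUP BY top_typ_vision, rg_no
    ORDER BY row_cnt DESC
    """
).fetchall()

total = sum(cnt for _, _, cnt in skew)
top_share = skew[0][2] / total  # share of all rows held by the heaviest value
print(skew[0], f"{top_share:.0%}")
```

If one value holds a large share of the rows, as in this made-up data, a different primary index or a NoPI table is worth trying.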

Related

Delete Duplicates from large mysql Address DB

I know that deleting duplicates from MySQL is often discussed here, but none of the solutions works well in my case.
I have a DB with address data that looks roughly like this:
ID; Anrede; Vorname; Nachname; Strasse; Hausnummer; PLZ; Ort; Nummer_Art; Vorwahl; Rufnummer
ID is the primary key and unique.
And I have entries like these, for example:
1;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Mobile;012345;67890
2;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Fixed;045678;877656
The different phone numbers are not the problem, because they are not relevant to me. I just want to delete rows that are duplicates in Lastname, Street and Zipcode. In this case that means ID 1 or ID 2; which of the two doesn't matter.
So far I have tried it like this with DELETE:
DELETE db
FROM Import_Daten db,
Import_Daten dbl
WHERE db.id > dbl.id AND
db.Lastname = dbl.Lastname AND
db.Strasse = dbl.Strasse AND
db.PLZ = dbl.PLZ;
And insert into a copy table:
INSERT INTO Import_Daten_1
SELECT MIN(db.id),
db.Anrede,
db.Firstname,
db.Lastname,
db.Branche,
db.Strasse,
db.Hausnummer,
db.Ortsteil,
db.Land,
db.PLZ,
db.Ort,
db.Kontaktart,
db.Vorwahl,
db.Durchwahl
FROM Import_Daten db,
Import_Daten dbl
WHERE db.lastname = dbl.lastname AND
db.Strasse = dbl.Strasse And
db.PLZ = dbl.PLZ;
The complete table contains over 10 million rows, and the size is my actual problem. MySQL runs on a MAMP server on a MacBook with 1.5 GHz and 4 GB RAM, so it is not really fast. The SQL statements are run in phpMyAdmin. At the moment I have no other system available.
You can write a stored procedure that selects a different chunk of data on each pass (for example, rows whose row number lies between two values) and deletes only within that range. This way you delete your duplicates bit by bit.
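The chunked approach can be sketched roughly as follows. This is not a MySQL stored procedure: it is a Python loop over an in-memory SQLite table, used only so the idea can be run anywhere. The function name `delete_duplicates_in_chunks`, the lowercase table and column names and the sample rows are assumptions made for the example. Each pass deletes duplicates only inside one id range and keeps the row with the smallest id of every group.

```python
import sqlite3

def delete_duplicates_in_chunks(conn, chunk_size):
    """Delete duplicate rows one id range at a time; return rows deleted."""
    max_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM import_daten"
    ).fetchone()[0]
    deleted = 0
    for start in range(1, max_id + 1, chunk_size):
        # Inside this id range, delete every row for which a row with the
        # same key and a smaller id exists anywhere in the table.
        cur = conn.execute(
            """
            DELETE FROM import_daten
            WHERE id BETWEEN ? AND ?
              AND EXISTS (
                  SELECT 1 FROM import_daten AS keep
                  WHERE keep.lastname = import_daten.lastname
                    AND keep.strasse = import_daten.strasse
                    AND keep.plz = import_daten.plz
                    AND keep.id < import_daten.id
              )
            """,
            (start, start + chunk_size - 1),
        )
        conn.commit()  # keep each transaction small
        deleted += cur.rowcount
    return deleted

# Invented sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE import_daten "
    "(id INTEGER PRIMARY KEY, lastname TEXT, strasse TEXT, plz TEXT)"
)
conn.executemany(
    "INSERT INTO import_daten VALUES (?, ?, ?, ?)",
    [
        (1, "Müller", "Testweg", "55555"),
        (2, "Müller", "Testweg", "55555"),
        (3, "Schmidt", "Hauptstr", "11111"),
        (4, "Müller", "Testweg", "55555"),
        (5, "Schmidt", "Hauptstr", "11111"),
    ],
)
deleted = delete_duplicates_in_chunks(conn, chunk_size=2)
remaining_ids = [r[0] for r in conn.execute("SELECT id FROM import_daten ORDER BY id")]
print(deleted, remaining_ids)
```

On a table with 10 million rows, an index on the three key columns would probably be needed for the inner lookup to stay fast.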
A more effective two-table solution could look like the following.
We store only the data we really need to delete, and only the fields that contain duplicate information.
Let's assume we are looking for duplicate data in the Lastname, Branche and Hausnummer fields.
Drop the helper table in case it is left over from a previous run
DROP TABLE IF EXISTS data_to_delete;
Create the table that holds the duplicate data and populate it (I assume all fields have the type VARCHAR(255))
CREATE TABLE data_to_delete (
id BIGINT COMMENT 'this field will contain ID of row that we will not delete',
cnt INT,
Lastname VARCHAR(255),
Branche VARCHAR(255),
Hausnummer VARCHAR(255)
) AS SELECT
min(t1.id) AS id,
count(*) AS cnt,
t1.Lastname,
t1.Branche,
t1.Hausnummer
FROM Import_Daten AS t1
GROUP BY t1.Lastname, t1.Branche, t1.Hausnummer
HAVING count(*)>1 ;
Now let's delete the duplicate data and keep only one record from each set of duplicates
DELETE Import_Daten
FROM Import_Daten LEFT JOIN data_to_delete
ON Import_Daten.Lastname=data_to_delete.Lastname
AND Import_Daten.Branche=data_to_delete.Branche
AND Import_Daten.Hausnummer = data_to_delete.Hausnummer
WHERE Import_Daten.id != data_to_delete.id;
DROP TABLE data_to_delete;
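The two steps above can be tried out on a small scale with the sketch below. It runs against in-memory SQLite from Python, so the syntax differs from MySQL: SQLite has no multi-table DELETE, so the join is written as an EXISTS subquery. The sample rows are made up, and the lowercase table and column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE import_daten "
    "(id INTEGER PRIMARY KEY, lastname TEXT, branche TEXT, hausnummer TEXT)"
)
# Invented sample rows: ids 1, 2 and 4 are duplicates of each other.
conn.executemany(
    "INSERT INTO import_daten VALUES (?, ?, ?, ?)",
    [
        (1, "Müller", "IT", "1"),
        (2, "Müller", "IT", "1"),
        (3, "Meier", "Bau", "7"),
        (4, "Müller", "IT", "1"),
    ],
)

# Step 1: one helper row per duplicate group, holding the id to keep.
conn.execute(
    """
    CREATE TABLE data_to_delete AS
    SELECT MIN(id) AS id, COUNT(*) AS cnt, lastname, branche, hausnummer
    FROM import_daten
    GROUP BY lastname, branche, hausnummer
    HAVING COUNT(*) > 1
    """
)

# Step 2: delete every row of a duplicate group except the one to keep.
conn.execute(
    """
    DELETE FROM import_daten
    WHERE EXISTS (
        SELECT 1 FROM data_to_delete AS d
        WHERE d.lastname = import_daten.lastname
          AND d.branche = import_daten.branche
          AND d.hausnummer = import_daten.hausnummer
          AND d.id <> import_daten.id
    )
    """
)
conn.execute("DROP TABLE data_to_delete")

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM import_daten ORDER BY id")]
print(remaining_ids)
```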
You can add a new column e.g. uq and make it UNIQUE.
ALTER TABLE Import_Daten
ADD COLUMN `uq` BINARY(16) NULL,
ADD UNIQUE INDEX `uq_UNIQUE` (`uq` ASC);
When this is done you can execute an UPDATE query like this
UPDATE IGNORE Import_Daten
SET
uq = UNHEX(
MD5(
CONCAT(
Import_Daten.Lastname,
Import_Daten.Strasse,
Import_Daten.PLZ
)
)
)
WHERE
uq IS NULL;
UPDATE IGNORE skips every row whose hash would violate the unique index, so after the update all duplicates still have uq = NULL and can be removed. If you execute the query again before removing them, nothing changes any more.
The result then is:
0 row(s) affected, 1 warning(s): 1062 Duplicate entry...
For newly added rows, always create the uq hash, and consider using it as the primary key once all entries are unique.
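The hash-column idea can be reproduced in miniature as shown below. This sketch uses SQLite from Python, where the counterpart of MySQL's UPDATE IGNORE is UPDATE OR IGNORE. SQLite has no built-in MD5, so a helper `md5_hex` is registered as a user function; that helper, the `'|'` separator between the fields (added so that "ab" + "c" and "a" + "bc" hash differently) and the sample rows are assumptions of this example, not part of the answer above.

```python
import hashlib
import sqlite3

def md5_hex(text):
    """Hex MD5 digest of a string (stand-in for MySQL's MD5())."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.create_function("md5_hex", 1, md5_hex)
conn.execute(
    "CREATE TABLE import_daten ("
    "id INTEGER PRIMARY KEY, lastname TEXT, strasse TEXT, plz TEXT, "
    "uq TEXT UNIQUE)"  # several NULLs are allowed in a UNIQUE column
)
conn.executemany(
    "INSERT INTO import_daten (id, lastname, strasse, plz) VALUES (?, ?, ?, ?)",
    [
        (1, "Müller", "Testweg", "55555"),
        (2, "Müller", "Testweg", "55555"),
        (3, "Schmidt", "Hauptstr", "11111"),
    ],
)

# Rows whose hash already exists are skipped and keep uq = NULL.
conn.execute(
    """
    UPDATE OR IGNORE import_daten
    SET uq = md5_hex(lastname || '|' || strasse || '|' || plz)
    WHERE uq IS NULL
    """
)
duplicates = conn.execute(
    "SELECT COUNT(*) FROM import_daten WHERE uq IS NULL"
).fetchone()[0]

# Everything still without a hash is a duplicate and can be removed.
conn.execute("DELETE FROM import_daten WHERE uq IS NULL")
remaining = conn.execute(
    "SELECT lastname FROM import_daten ORDER BY lastname"
).fetchall()
print(duplicates, remaining)
```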

Hive dynamic partition by unix timestamp

I am creating a table in Hive, running a mapper transformation and then saving the table. I want to partition the table by the time at which I ran the Hive query.
I create the table:
CREATE EXTERNAL TABLE IF NOT EXISTS testtable (
test_test STRING
) PARTITIONED BY (time STRING)
LOCATION 'loc/table'
;
Then run the transformation and save the table while trying this:
FROM (
MAP
one.test_test
USING
'python job.py'
AS test1
FROM
one
) test_step
INSERT OVERWRITE TABLE testtable PARTITION (time=unix_timestamp())
SELECT CAST ( test_step.test1 AS STRING ) AS test_test
;
However, when I use time=unix_timestamp(), I get an exception. How would I go about doing this?
Thanks
I think it should work if you use dynamic partitioning (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-DynamicPartitionInserts). The partition field is just another column in the table, so if your query supplies a value for that column, Hive automatically puts each row in the right partition. Depending on your configuration, you may first need to set hive.exec.dynamic.partition=true and hive.exec.dynamic.partition.mode=nonstrict. Your statement would then look something like this:
FROM (
MAP
one.test_test
USING
'python job.py'
AS test1
FROM
one
) test_step
INSERT OVERWRITE TABLE testtable PARTITION (time)
SELECT CAST ( test_step.test1 AS STRING ) AS test_test,
unix_timestamp() as time
;
Doing it like this might create a lot of partitions though, as the value of unix_timestamp() will change during execution of the query. It would be better to use an extra statement to create the partition first and then insert.
EDIT: To add a partition beforehand you'd need to set the timestamp you want somehow, e.g. a parameter for the script. Then
ALTER TABLE testtable ADD PARTITION (time=your_timestamp_here);
This would go before your original query where you replace unix_timestamp() with your_timestamp_here (which of course would be a valid unix timestamp).
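Passing the timestamp in as a parameter could look roughly like the sketch below: a small Python helper takes the timestamp once and builds both statements from it, so the same value ends up in the ALTER TABLE and in the INSERT. The function name `build_hive_statements` is hypothetical, the partition value is quoted because the column is declared as STRING, and how the statements are then handed to Hive (CLI, script file, client library) is left open.

```python
import time

def build_hive_statements(partition_ts):
    """Return the ALTER TABLE and INSERT statements for one fixed timestamp."""
    add_partition = (
        f"ALTER TABLE testtable ADD PARTITION (time='{partition_ts}');"
    )
    insert = (
        "FROM (\n"
        "  MAP one.test_test USING 'python job.py' AS test1 FROM one\n"
        ") test_step\n"
        # Static partition: the value is fixed before the query starts.
        f"INSERT OVERWRITE TABLE testtable PARTITION (time='{partition_ts}')\n"
        "SELECT CAST(test_step.test1 AS STRING) AS test_test;"
    )
    return [add_partition, insert]

# Take the timestamp exactly once, before any statement runs.
partition_ts = int(time.time())
statements = build_hive_statements(partition_ts)
for statement in statements:
    print(statement)
```

Because the value is fixed up front, every row of the run lands in a single partition.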

Store records in a new table created by a query in mysql

I have two tables, location and locationdata. I want to query data from both tables using a join and store the result in a new table (locationCreatedNew) that does not yet exist in MySQL. Can I do this in MySQL?
SELECT location.id,locationdata.name INTO locationCreatedNew FROM
location RIGHT JOIN locationdata ON
location.id=locationdata.location_location_id;
The sample code in your question is SQL Server syntax (SELECT ... INTO); the counterpart in MySQL is something like:
CREATE TABLE locationCreatedNew
SELECT * FROM location RIGHT JOIN locationdata
ON location.id=locationdata.location_location_id;
Reference: CREATE TABLE ... SELECT
For CREATE TABLE ... SELECT, the destination table does not preserve information about whether columns in the selected-from table are generated columns. The SELECT part of the statement cannot assign values to generated columns in the destination table.
Some conversion of data types might occur. For example, the AUTO_INCREMENT attribute is not preserved, and VARCHAR columns can become CHAR columns. Retained attributes are NULL (or NOT NULL) and, for those columns that have them, CHARACTER SET, COLLATION, COMMENT, and the DEFAULT clause.
When creating a table with CREATE TABLE ... SELECT, make sure to alias any function calls or expressions in the query. If you do not, the CREATE statement might fail or result in undesirable column names.
CREATE TABLE newTbl
SELECT tbl1.clm, COUNT(tbl2.tbl1_id) AS number_of_recs_tbl2
FROM tbl1 LEFT JOIN tbl2 ON tbl1.id = tbl2.tbl1_id
GROUP BY tbl1.id;
NOTE: newTbl is the name of the new table you want to create. You can use SELECT * FROM othertable which is the query that returns the data the table should be created from.
You can also explicitly specify the data type for a column in the created table:
CREATE TABLE foo (a TINYINT NOT NULL) SELECT b+1 AS a FROM bar;
For CREATE TABLE ... SELECT, if IF NOT EXISTS is given and the target table exists, nothing is inserted into the destination table, and the statement is not logged.
To ensure that the binary log can be used to re-create the original tables, MySQL does not permit concurrent inserts during CREATE TABLE ... SELECT.
You cannot use FOR UPDATE as part of the SELECT in a statement such as CREATE TABLE new_table SELECT ... FROM old_table .... If you attempt to do so, the statement fails.
Please check the reference for more details. Hope this helps you.
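To see the effect of creating a table from a join and of aliasing the selected columns, here is a small runnable sketch. It uses SQLite from Python rather than MySQL, so it writes CREATE TABLE ... AS SELECT and swaps the RIGHT JOIN for an equivalent LEFT JOIN with the tables reversed (older SQLite versions have no RIGHT JOIN). The sample rows and the alias `location_id` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE locationdata (name TEXT, location_location_id INTEGER)"
)
# Invented sample rows: 'b' points at a location that does not exist.
conn.executemany("INSERT INTO location VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO locationdata VALUES (?, ?)", [("a", 1), ("b", 3)])

# Create the new table directly from the join; every selected expression
# gets an explicit alias, which becomes the column name of the new table.
conn.execute(
    """
    CREATE TABLE locationCreatedNew AS
    SELECT location.id AS location_id, locationdata.name AS name
    FROM locationdata
    LEFT JOIN location
      ON location.id = locationdata.location_location_id
    """
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(locationCreatedNew)")]
rows = conn.execute(
    "SELECT location_id, name FROM locationCreatedNew ORDER BY name"
).fetchall()
print(columns, rows)
```

All rows of locationdata are kept, and location_id is NULL where no matching location exists, which is what the RIGHT JOIN in the question would produce as well.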
Use a query like the one below.
create table new_tbl as
select col1, col2, col3 from old_tbl t1, old_tbl t2
where condition;

Error message for part of the query: Either list of columns or a custom serializer should be specified

I am doing something wrong in this section of the query file:
CREATE TABLE LR_Charts;
INSERT INTO TABLE LR_Charts
select campid,CampNum,Count,Legend from tmp_LRchart1 Order By CampNum;
ALTER TABLE LR_Charts ADD COLUMNS (CountCumm INT);
Select tmp_LRchart1.campid, tmp_LRchart1.Count, SUM(LR_Charts.Count)
as LR_Charts.CountCumm from tmp_LRchart1, LR_Charts
where tmp_LRchart1.campid >= LR_Charts.campid
group by tmp_LRchart1.campid order by tmp_LRchart1.campid;
Kindly help.
The statement
CREATE TABLE LR_Charts;
is wrong.
You are trying to create a table without specifying a list of columns for it.
It should look like this:
CREATE TABLE LR_Charts( i int, v varchar(10) );
But looking at your statements, what I understand is that
you are trying to create a table with data from another table.
If that is right, then your query should look like this:
CREATE TABLE LR_Charts AS
select campid,CampNum,Count,Legend from tmp_LRchart1 Order By CampNum;
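Once LR_Charts exists, the running total that the last statement in the question is after can be computed with the same self-join idea. The sketch below is only an illustration in SQLite via Python, not Hive: the column is called `cnt` instead of Count, the result alias `count_cumm` is a plain name without a table prefix, and the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp_lrchart1 (campid INTEGER, campnum INTEGER, cnt INTEGER)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO tmp_lrchart1 VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 20, 3), (3, 30, 7)],
)

# Create the chart table directly from the query result.
conn.execute(
    "CREATE TABLE lr_charts AS SELECT campid, campnum, cnt FROM tmp_lrchart1"
)

# Running total: each campid sums the counts of all campids up to itself.
running = conn.execute(
    """
    SELECT t.campid, t.cnt, SUM(c.cnt) AS count_cumm
    FROM tmp_lrchart1 AS t
    JOIN lr_charts AS c ON t.campid >= c.campid
    GROUP BY t.campid, t.cnt
    ORDER BY t.campid
    """
).fetchall()
print(running)
```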

Duplicate record in mySQL

I have a MySQL db with duplicate records, as shown in the attached image.
I am asking for a query to delete all duplicate records based on date + time, for all tables (for each table) in the db
Thanks
As far as I can see, you don't have an auto-increment primary key or a foreign key.
If none of your tables has foreign keys or relations to other tables, first list all your tables. After that, create a temporary "mirror" of one table (e.g. autogrill).
Then you can do a:
INSERT INTO TemporalTable
SELECT DISTINCT * FROM autogrill
or a
INSERT INTO TemporalTable
SELECT Id, Date, Time FROM autogrill GROUP BY Id, Date, Time
(do not add HAVING COUNT(*) > 1 here, or rows that occur only once will be missing from the mirror and lost in the next step).
Then empty the original table with TRUNCATE or with a DELETE FROM without a WHERE clause, and load your data back with
INSERT INTO autogrill
SELECT * FROM TemporalTable
Be careful if your tables have primary keys when doing this.
How about creating a STORED PROCEDURE for this?
DELIMITER $$
CREATE PROCEDURE `DeleteDup`()
BEGIN
-- Drops the helper table if it is left over from a previous run.
DROP TABLE IF EXISTS bad_temp;
-- Creates a temporary table for distincts record.
CREATE TABLE bad_temp(id INT, name VARCHAR(20));
-- Selects distinct record and inserts it on the temp table
INSERT INTO bad_temp(id,name) SELECT DISTINCT id,name FROM bad_table;
-- Delete All Entries from the table which contains duplicate
-- (you can add also condition on this)
DELETE FROM bad_table;
-- Selects all records from temp table and
-- inserts back in the original table
INSERT INTO bad_table(id,name) SELECT id,name FROM bad_temp;
-- Drops temporary table.
DROP TABLE bad_temp;
END$$
DELIMITER ;
Please change the table and column names to match your schema.
Once you have created the stored procedure, you can use it like this:
CALL DeleteDup();
You can export your table using this query (it relies on MySQL's ONLY_FULL_GROUP_BY mode being disabled):
SELECT * FROM autogrill GROUP BY id
Then empty your table and import the export you made before. I don't know of another easy way to erase duplicate entries using only a single query.
One easy way to do this is to copy all the distinct records into a new table or an export. Then delete all records in the original table and copy them back in.
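The copy-out, empty, copy-back routine can be sketched as follows. The example runs on in-memory SQLite from Python, so it only illustrates the sequence of steps; the helper table name `autogrill_distinct`, the three columns and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE autogrill (id INTEGER, date TEXT, time TEXT)")
# Invented sample rows with two duplicated records.
conn.executemany(
    "INSERT INTO autogrill VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "10:00"),
        (1, "2020-01-01", "10:00"),
        (2, "2020-01-01", "11:00"),
        (2, "2020-01-01", "11:00"),
        (3, "2020-01-02", "09:00"),
    ],
)

# 1. Copy every distinct row into a helper table.
conn.execute(
    "CREATE TABLE autogrill_distinct AS SELECT DISTINCT * FROM autogrill"
)
# 2. Empty the original table.
conn.execute("DELETE FROM autogrill")
# 3. Copy the distinct rows back and drop the helper table.
conn.execute("INSERT INTO autogrill SELECT * FROM autogrill_distinct")
conn.execute("DROP TABLE autogrill_distinct")
conn.commit()

rows = conn.execute("SELECT * FROM autogrill ORDER BY id").fetchall()
print(rows)
```

Unlike a GROUP BY with HAVING COUNT(*) > 1, SELECT DISTINCT keeps the rows that were never duplicated, so nothing is lost when the table is refilled.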
If the table has an auto-increment column, insert NULL for it and use an alias for the source table, for example:
INSERT INTO product
SELECT NULL,p.product_sku,
p.product_s_desc,
p.product_desc,
p.product_thumb_image,
p.product_full_image,
p.product_tech_data,
p.product_publish,
p.product_weight,
p.product_weight_uom,
p.product_length,
p.product_width,
p.product_height,
p.product_lwh_uom,
p.product_url,
p.product_in_stock,
p.product_available_date,
p.product_special,
p.create_date,
p.modify_date,
p.product_name,
p.attribute
FROM product AS p WHERE p.product_id=xxx;