I'm not a MySQL guy; I'm actually doing this to help a friend.
I have these tables in a MySQL database:
create table post (ID bigint, p text)
create table user (ID bigint, user_id bigint)
and I'm querying them with this script:
select * from post
where ID in (select user_id from user where ID = 50)
order by ID DESC -- this line kills performance
limit 0,20
As I mentioned in the comment, the query executes very fast without order by ID DESC. When I add it, the query becomes very slow and uses a lot of CPU. Do you have any idea what I am doing wrong?
You should define ID as Primary Key for your table. This will add an index and increase performance. At least as a first step, it's a good one.
This query should do the trick:
create table post (
ID bigint,
p text,
PRIMARY KEY (ID));
Thanks to #frlan, the problem was solved by adding indexes:
CREATE INDEX IDX_POST_ID ON post (ID);
CREATE INDEX IDX_USER_ID ON user (ID);
CREATE INDEX IDX_USER_USERID ON user (user_id);
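A minimal sketch of the indexed setup, for anyone who wants to try the query shape locally. It uses Python's sqlite3 module as a stand-in for MySQL, so the column types and the planner differ; the sample rows and the helper name top_posts_for_user are made up for illustration.

```python
import sqlite3


def top_posts_for_user(conn, user_row_id, limit=20):
    """Return the newest posts whose ID is listed as user_id for the given user.ID."""
    return conn.execute(
        """
        SELECT p.ID, p.p
        FROM post AS p
        WHERE p.ID IN (SELECT u.user_id FROM user AS u WHERE u.ID = ?)
        ORDER BY p.ID DESC
        LIMIT ?
        """,
        (user_row_id, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The primary key on post.ID gives the ORDER BY an index to walk.
    CREATE TABLE post (ID INTEGER PRIMARY KEY, p TEXT);
    CREATE TABLE user (ID INTEGER, user_id INTEGER);
    CREATE INDEX IDX_USER_ID ON user (ID);
    CREATE INDEX IDX_USER_USERID ON user (user_id);
    """
)

# Hypothetical sample data: 100 posts, user row 50 references posts 1..60.
conn.executemany(
    "INSERT INTO post VALUES (?, ?)", [(i, f"post {i}") for i in range(1, 101)]
)
conn.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(50, i) for i in range(1, 61)] + [(51, i) for i in range(61, 101)],
)

rows = top_posts_for_user(conn, 50)
print([r[0] for r in rows][:3])
```

On MySQL itself, running EXPLAIN on the original query before and after adding the indexes is the way to confirm that they are actually used.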
I was using a MySQL database, but one of my history tables is growing very fast. It already has more than 300 million rows, which makes the database slow and backups difficult to create, so I decided to move just that table to Cassandra. This is my first time using Cassandra. In MySQL I store user_id, video_id, watch_secs, watch_counter, and timestamp; (user_id, video_id) is a unique composite key, and I increment watch_secs and watch_counter if the row already exists. I tried the following in Cassandra:
CREATE TABLE IF NOT EXISTS history
(
user_id int,
video_id int,
watch_secs int,
watch_counter int,
last_updated timestamp,
history_timestamp timestamp,
PRIMARY KEY ((user_id, video_id))
);
CREATE TABLE IF NOT EXISTS history_counter
(
user_id int,
video_id int,
watch_secs counter,
watch_counter counter,
PRIMARY KEY ((user_id, video_id))
);
I created two tables: one for incrementing the seconds and the counter, and another holding the same data with timestamps, because of the limitations of counter columns.
This works well for storing data, but I have two issues with fetching and deleting data.
I want to fetch the last 10 history entries for a specific user. I tried a
query, but it needs both user_id and video_id in the WHERE clause.
I want to delete history by video_id.
So the main issue is that fetching or deleting data with only one part of the partition key does not work, and I can't find a solution.
I would really appreciate your help. I am open to any other database that fits this better, or to any solution within this database.
SELECT ...
FROM history
WHERE user_id = ?
ORDER BY history_timestamp DESC
LIMIT 10
and add this to the table history:
INDEX(user_id, history_timestamp)
That probably needs a JOIN using video_id to some other table to get the names of the 10 videos.
(What is history_counter for? The current state of someone viewing a video? Something else?)
So I have an existing MySQL users table with thousands of records in it. I have noticed duplicate records for users which is a problem that I need to address. I know that the way I need to do this is to somehow make 2 columns unique.
The duplicates are records that share both the same server_id and the same user_id. These 2 columns are meant to be unique in combination, so there should only ever be 1 row per user_id per server_id.
I have figured out how I can find these duplicates using the following query:
SELECT `server_id`, `user_id`, COUNT(*) AS `duplicates` FROM `guild_users` GROUP BY `server_id`, `user_id` HAVING `duplicates` > 1
From what I have read, I need to delete all duplicates first before I add any constraints. This is one of the things I am unsure about.
Question 1: How would I go about deleting all duplicates but leaving 1 of each, so the user still exists without the other duplicates?
Question 2: What is the best way of avoiding duplicates from being created? Should I create a unique constraint for both of the columns, or do something with primary keys instead?
This assumes your table has a primary key column such as id.
So you can use EXISTS to delete the duplicates and keep just 1:
delete gu from guild_users gu
where exists (
select 1 from guild_users
where
server_id = gu.server_id
and
user_id = gu.user_id
and
id > gu.id
)
After that you can create a unique constraint for the 2 columns:
alter table guild_users
add constraint un_server_user unique
(server_id, user_id);
You want to prevent this by adding a unique index:
create unique index unq_guild_users_server_user on guild_users(server_id, user_id);
If you have a primary key, you can delete the duplicates before adding the unique index:
delete g
from guild_users g left join
(select server_id, user_id, max(primary_key) as max_pk
from guild_users
group by server_id, user_id
) su
on g.primary_key = su.max_pk
where su.max_pk is null;
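The delete-then-constrain sequence from these answers can be tried end to end with a small sketch. This one uses Python's sqlite3 module rather than MySQL, and the sample rows are made up. The DELETE here uses NOT IN with a subquery on the same table, which SQLite accepts; MySQL rejects that form with error 1093, so on MySQL use one of the statements from the answers instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE guild_users ("
    "id INTEGER PRIMARY KEY, server_id INTEGER, user_id INTEGER)"
)
# Hypothetical data: (1, 10) appears twice and (2, 10) three times.
conn.executemany(
    "INSERT INTO guild_users (server_id, user_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10), (2, 10), (2, 10)],
)

# Step 1: keep only the row with the highest id in each (server_id, user_id) group.
conn.execute(
    """
    DELETE FROM guild_users
    WHERE id NOT IN (
        SELECT MAX(id) FROM guild_users GROUP BY server_id, user_id
    )
    """
)

# Step 2: with the duplicates gone, the unique index can be created.
conn.execute(
    "CREATE UNIQUE INDEX unq_guild_users_server_user "
    "ON guild_users (server_id, user_id)"
)

remaining = conn.execute(
    "SELECT id, server_id, user_id FROM guild_users ORDER BY id"
).fetchall()

# Step 3: a new duplicate is now rejected by the index.
try:
    conn.execute("INSERT INTO guild_users (server_id, user_id) VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(remaining, duplicate_rejected)
```

Running the delete on a copy of the table first is a cheap way to check that the surviving rows are the ones you expect.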
I would like to know how MySQL prioritizes indexes. I have the following table:
CREATE TABLE table (
colum1 VARCHAR(50),
colum2 VARCHAR(50),
colum3 ENUM('a', 'b', 'c'),
PRIMARY KEY(colum1, colum2, colum3)
);
CREATE INDEX colum1_idx ON table (colum1);
CREATE INDEX coloum2_idx ON table (colum2);
const query = `SELECT * FROM table
WHERE colum1 = ?
ORDER BY colum2
LIMIT ?,?`;
Basically, my PK is composed of all the fields (I need to use INSERT IGNORE), and I query using colum1 in the WHERE clause and ORDER BY colum2.
My question is: should I create 2 separate indexes, or 1 index on (colum1, colum2)?
Thanks to #JuanCarlosOpo
I found the answer here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#algorithm_step_2c_order_by_
A compound index on both columns performs better:
CREATE INDEX colum_idx ON table (colum1,colum2);
Thanks a lot!
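A rough way to see the effect of the compound index is to look at the query plan. The sketch below uses Python's sqlite3 module as a stand-in, so the planner is SQLite's, not MySQL's. The table name t is hypothetical (table is a reserved word), the primary key is left out so that the compound index is the only index available, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (colum1 TEXT, colum2 TEXT, colum3 TEXT);
    -- Equality column first, ORDER BY column second.
    CREATE INDEX colum_idx ON t (colum1, colum2);
    """
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "b", "a"), ("x", "a", "b"), ("y", "c", "c"), ("x", "c", "a")],
)

query = "SELECT * FROM t WHERE colum1 = ? ORDER BY colum2 LIMIT ?, ?"
params = ("x", 0, 2)

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)
)
rows = conn.execute(query, params).fetchall()

print(plan)
print(rows)
```

If the plan names colum_idx and shows no separate sorting step, the index is serving both the filter and the ordering. On MySQL, the equivalent check is EXPLAIN, looking for the absence of "Using filesort" in the Extra column.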
I have two sql tables called scan_sc and rescan_rsc. The scan table looks like this:
CREATE TABLE scan_sc
(
id_sc int(4),
Type_sc varchar(255),
ReScan_sc varchar(255),
PRIMARY KEY (id_sc)
)
When I scan a document, I insert a row into the scan table. If the result of the scan is poor, I have to do a rescan, which is why I have a rescan table:
CREATE TABLE rescan_rsc
(
id_rsc int(4),
Scan_rsc varchar(255),
PRIMARY KEY (id_rsc)
)
The problem is that I want a trigger that fills in the ReScan_sc column of the scan_sc table with an "x", so I can see that there were problems with that scan.
The trigger should do this for the row where the id in the rescan table matches the id in the scan table.
I hope you all understand my question.
Thanks in advance.
Do you really need the ReScan_sc column and the trigger?
With a simple JOIN, you can find out the records in your scan_sc table that have been re-scanned, without using the ReScan_sc column at all.
There are several possibilities:
Show all scans, with an additional column with the Rescan ID, if any:
SELECT scan_sc.*, rescan_rsc.id_rsc
FROM scan_sc
LEFT JOIN rescan_rsc ON scan_sc.id_sc = rescan_rsc.id_rsc
Show only the scans which have been re-scanned:
SELECT scan_sc.*
FROM scan_sc
INNER JOIN rescan_rsc ON scan_sc.id_sc = rescan_rsc.id_rsc
(I assume that id_sc and id_rsc are the primary keys and that PRIMARY KEY (id_sd) is a typo, like marc_s pointed out in his comment)
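To illustrate the JOIN approach, here is a small sketch that derives the "x" marker at query time instead of storing it. It runs on Python's sqlite3 module rather than MySQL, and the sample rows and the CASE expression are my own additions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scan_sc (id_sc INTEGER PRIMARY KEY, Type_sc TEXT);
    CREATE TABLE rescan_rsc (id_rsc INTEGER PRIMARY KEY, Scan_rsc TEXT);
    -- Hypothetical data: only scan 2 needed a rescan.
    INSERT INTO scan_sc VALUES (1, 'invoice'), (2, 'letter'), (3, 'contract');
    INSERT INTO rescan_rsc VALUES (2, 'second pass');
    """
)

# LEFT JOIN keeps every scan; the CASE turns "a matching rescan exists" into 'x'.
flagged = conn.execute(
    """
    SELECT scan_sc.id_sc,
           scan_sc.Type_sc,
           CASE WHEN rescan_rsc.id_rsc IS NULL THEN '' ELSE 'x' END AS ReScan_sc
    FROM scan_sc
    LEFT JOIN rescan_rsc ON scan_sc.id_sc = rescan_rsc.id_rsc
    ORDER BY scan_sc.id_sc
    """
).fetchall()

print(flagged)
```

Because the marker is computed from rescan_rsc on every read, it cannot drift out of sync the way a trigger-maintained column could.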
I have a huge table of products, but it contains a lot of duplicate entries. The table has more than 10 thousand entries, and I want to remove the duplicates without finding and deleting them manually. Please let me know if you can suggest a solution for this.
You could use SELECT DISTINCT INTO TempTable, drop the original table, and then rename the temp one.
You should also add primary and unique keys to avoid this sort of thing in the future.
For full-row duplicates, try this:
create table mytable_tmp as select distinct * from mytable;
drop table mytable;
alter table mytable_tmp rename mytable;
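The rebuild-from-distinct-rows idea can be checked on a toy table first. This sketch uses Python's sqlite3 module, so the rename syntax is SQLite's (RENAME TO); the column names and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable (product_name TEXT, price INTEGER);
    -- Hypothetical data: two exact duplicates, plus 'pen' at a second price.
    INSERT INTO mytable VALUES
        ('pen', 2), ('pen', 2), ('ink', 5), ('pen', 3), ('ink', 5);

    -- Copy each distinct row once, then swap the tables.
    CREATE TABLE mytable_tmp AS SELECT DISTINCT * FROM mytable;
    DROP TABLE mytable;
    ALTER TABLE mytable_tmp RENAME TO mytable;
    """
)

rows = conn.execute(
    "SELECT product_name, price FROM mytable ORDER BY product_name, price"
).fetchall()
print(rows)
```

Note that only rows identical in every column are collapsed, so ('pen', 2) and ('pen', 3) both survive. A table created from a SELECT also does not inherit the original's indexes or keys, so those have to be re-created afterwards.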
The statements below should help with your requirement.
If the table (foo) has a primary key field
First step
Store the key values in a temporary table, and put the columns that define uniqueness in the GROUP BY clause.
For example, to delete rows with a duplicate email id, put the email id in the GROUP BY clause and select the primary key
with either min(primarykey) or max(primarykey).
CREATE TEMPORARY TABLE temptable AS SELECT min( primarykey ) FROM foo GROUP BY uniquefields;
Second step
Run the delete statement below, substituting your table name and primary key column:
DELETE FROM foo WHERE primarykey NOT IN (SELECT * FROM temptable );
Execute both queries together in your query analyser or DB tool.
If the table (foo) doesn't have a primary key field
step 1
CREATE TABLE temp_table AS SELECT * FROM foo GROUP BY field or fields;
step 2
DELETE FROM foo;
step 3
INSERT INTO foo select * from temp_table;
There are different solutions for removing duplicate rows, and which one to use depends on your scenario. The simplest method is to alter the table, adding a unique index on the product name field:
alter ignore table products add unique index `unique_index` (product_name);
Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions. Once the duplicate rows have been deleted, you can remove the index:
alter table products drop index `unique_index`;
Please let me know if this resolves the issue. If not I can give you alternate solutions for that.
You can add more than one column to a GROUP BY, e.g.:
SELECT * from tableName GROUP BY prod_name HAVING count(prod_name) > 1
That shows one row for each product name that occurs more than once. To keep every product exactly once, drop the HAVING clause, write the result to a new table, and drop the existing one.