Improve MySQL searches on huge tables

How should I deal with thousands of rows in a table? How can searching be improved if my table has no primary key?
Example:
id attr
1 I'm
1 Too
1 Damn
2 Slow
2 To
2 Search
I can group the data together using group_concat(), but I'm unsure whether it will scan my complete table to get the end result. If it does, how can that be improved?

Create an index on the column you want to use in the search query.
For example, if your table is CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1));
then you can create an index on column B with CREATE INDEX B ON T1 (B);
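
As a rough sketch of how this could apply to the table in the question: the table name attrs, the column types, and the index name idx_attrs_id below are assumptions, since the question does not give them. An index on id lets MySQL find the rows for one id without reading the whole table, and EXPLAIN shows whether the index is actually chosen.

```sql
-- Hypothetical table matching the example rows in the question.
CREATE TABLE attrs (
  id   INT NOT NULL,
  attr VARCHAR(50) NOT NULL
);

-- A non-unique index is allowed even though the table has no primary key.
CREATE INDEX idx_attrs_id ON attrs (id);

-- Concatenate the attributes that belong to a single id.
SELECT id, GROUP_CONCAT(attr SEPARATOR ' ') AS attrs_joined
FROM attrs
WHERE id = 2
GROUP BY id;

-- Prefix the same query with EXPLAIN to inspect the plan;
-- the "key" column of the output names the index MySQL picked, if any.
EXPLAIN
SELECT id, GROUP_CONCAT(attr SEPARATOR ' ') AS attrs_joined
FROM attrs
WHERE id = 2
GROUP BY id;
```

Without a WHERE clause, a GROUP_CONCAT over every id still has to read every row, so the index helps mainly when the query filters on id.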

Related

multi-to-multi indexing issue, mysql

I have a many-to-many table that is going to have millions of rows. Let me describe my confusion with an example.
example:
table: car_dealer_rel
opt:1
columns: car_id: int unsigned, dealer_id: int unsigned
index on: car_id, dealer_id
car_id|dealer_id
-------|---------
1 | 1
1 | 2
....
sub-opt:1: a separate index on each of the two columns.
sub-opt:2: one combined index on the two columns.
opt-2:
one column table
col: car_id_dealer_id: varchar:21
index on: PKI on this single column.
The idea here is to store values as car_id.dealer_id and to search with patterns such as %.xxx and/or xxx.%
car_id_dealer_id
----------------
1.1
1.2
1.15
2.10
...
...
After millions of records, which option will be faster for:
reads
add/update/delete
I am a novice with MySQL; all help is appreciated.

With the first option:
car_id|dealer_id
-------|---------
1 | 1
1 | 2
you can easily create a composite index for each direction:
create index ind1 on car_dealer_rel (car_id,dealer_id );
create index ind2 on car_dealer_rel (dealer_id, car_id );
These work very fast,
and you can easily filter in either direction:
where car_id = your_value
or
where dealer_id = another_value
or using both
With the second option you can't do this easily: you frequently need string manipulation, which prevents the index from being used (a pattern with a leading wildcard such as %.xxx cannot be resolved as an index lookup), and in some cases it can't be done in SQL at all.
For update, insert, and delete, the performance remains practically the same.

It depends on the actual query you use. I suggest running EXPLAIN first, on a good amount of dummy data, to understand how MySQL is going to execute your query.
If you are going to find records by car_id alone, or by car_id and dealer_id, you can use a composite index on (car_id, dealer_id).
If you also want to find records by dealer_id alone, you can add an additional index on the dealer_id column.
Your one-column table option is not very good because:
You cannot find rows by dealer_id quickly.
The table schema is not normalized.
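
A minimal sketch of that EXPLAIN check, under some assumptions: the primary key on (car_id, dealer_id) presumes each pair appears only once, and the names idx_dealer_car and car_dealer_single are made up for illustration. On a table filled with representative data, the "key" column of the EXPLAIN output shows which index each query uses.

```sql
-- Two-column design from the question.
CREATE TABLE car_dealer_rel (
  car_id    INT UNSIGNED NOT NULL,
  dealer_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (car_id, dealer_id),         -- assumption: pairs are unique
  KEY idx_dealer_car (dealer_id, car_id)   -- hypothetical name; serves lookups by dealer_id
);

-- Lookup by car_id can use the leftmost column of the primary key.
EXPLAIN SELECT dealer_id FROM car_dealer_rel WHERE car_id = 1;

-- Lookup by dealer_id can use the second index.
EXPLAIN SELECT car_id FROM car_dealer_rel WHERE dealer_id = 2;

-- One-column design, for comparison (hypothetical table name).
CREATE TABLE car_dealer_single (
  car_id_dealer_id VARCHAR(21) NOT NULL PRIMARY KEY
);

-- A prefix pattern can be resolved as a range on the index...
EXPLAIN SELECT * FROM car_dealer_single WHERE car_id_dealer_id LIKE '1.%';

-- ...but a leading wildcard cannot, so every entry has to be examined.
EXPLAIN SELECT * FROM car_dealer_single WHERE car_id_dealer_id LIKE '%.2';
```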

MySql Indexes Sort and Where

I would like to know how MySQL handles index priority. I have the following table.
CREATE TABLE table (
colum1 VARCHAR(50),
colum2 VARCHAR(50),
colum3 ENUM('a', 'b', 'c'),
PRIMARY KEY(colum1, colum2, colum3)
);
CREATE INDEX colum1_idx ON table (colum1);
CREATE INDEX coloum2_idx ON table (colum2);
const query = `SELECT * FROM table
WHERE colum1 = ?
ORDER BY colum2
LIMIT ?,?`;
Basically, my PK is composed of all fields (I need to use INSERT IGNORE), and I query using colum1 in the WHERE clause and ORDER BY colum2.
My question is: should I create 2 separate indexes, or 1 index on (colum1, colum2)?

Thanks to @JuanCarlosOpo,
I found the answer here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#algorithm_step_2c_order_by_
A compound index on both columns performs better:
CREATE INDEX colum_idx ON table (colum1,colum2);
Thanks a lot!
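
One way to check this on real data is to compare EXPLAIN output, sketched below. The literal 'some value' and the LIMIT numbers are placeholders. If the Extra column does not show "Using filesort", the index is supplying the ORDER BY order. Since the primary key in the question already starts with (colum1, colum2), it may be worth running the EXPLAIN both before and after creating the extra index to see whether the plan changes at all.

```sql
-- `table` is a reserved word in MySQL, so it is quoted with backticks here.
CREATE INDEX colum_idx ON `table` (colum1, colum2);

-- Inspect the plan for the query from the question (placeholder values).
EXPLAIN
SELECT *
FROM `table`
WHERE colum1 = 'some value'
ORDER BY colum2
LIMIT 0, 20;
```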

Remove duplicate values without ID

I have a table like this:
uuid | username | first_seen | last_seen | score
Previously, the table used an ascending "player_id" column as its primary key. I removed player_id because I no longer needed it. I want to make 'uuid' the primary key, but there are a lot of duplicates. I want to remove all these duplicates from the table but keep the first one (based on the row number, the first row stays).
How can I do this? I've searched everywhere, but every answer shows how to do it when you have a row ID column...

I strongly advocate auto-incremented integer primary keys, so I would encourage you to go back to one. They are useful for several reasons:
They tell you the insert order of rows.
They are more efficient as primary keys.
Because primary keys are clustered in MySQL, new rows always go at the end.
But you don't have to follow that advice. My recommendation would be to copy the data into a temporary table and then reload it into your desired table:
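-- copy one row per uuid out of the original table t
create temporary table tt as
select t.*
from t
group by t.uuid;
-- empty the original table, add the key, and reload
truncate table t;
alter table t add constraint pk_uuid primary key (uuid);
insert into t
select * from tt;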
Note: I am using a (mis)feature of MySQL that allows you to group by one column while pulling columns not in the group by. I don't like this extension, but you do not specify how to choose the particular row you want. This will give values for the other columns from matching rows. There are other ways to get one row per uuid.
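
One of those other ways, sketched under several assumptions: it needs MySQL 8.0 or later for window functions, it treats the earliest first_seen as "the first row" (the table has no column that records the original row order, so this tie-breaker is a guess), and t_dedup and t_old are hypothetical names. It also sidesteps the ONLY_FULL_GROUP_BY SQL mode, which is enabled by default since MySQL 5.7.5 and rejects the GROUP BY shortcut above.

```sql
-- Empty copy of the original table t, with uuid as the primary key.
CREATE TABLE t_dedup LIKE t;
ALTER TABLE t_dedup ADD PRIMARY KEY (uuid);

-- Number the rows within each uuid and keep only the first one.
INSERT INTO t_dedup (uuid, username, first_seen, last_seen, score)
SELECT uuid, username, first_seen, last_seen, score
FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY uuid
                            ORDER BY first_seen, last_seen) AS rn
  FROM t
) AS ranked
WHERE rn = 1;

-- Swap the tables; t_old keeps the original rows as a fallback.
RENAME TABLE t TO t_old, t_dedup TO t;
```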

MYSQL Query Tuning for updating data of one table by data of next table

I have two tables:
1. PersonAddressList [About 5,000 records]
Columns:
ID int
TITLE varchar
CITY varchar
2. CityList [About 5,000 records]
Columns:
ID int
City_Name varchar
City_State int //[RK to State]
The previous designer added city names directly in table 1 [personaddresslist]. I am now normalising it by replacing the city name in table 1 with the city ID from table 2.
The query I used:
Update personaddresslist, CityList set CITY = cityList.ID where CITY = City_name
The query runs well when the tables hold little data, but it keeps running for a very long time when both tables hold a lot. In my real scenario I have a medium-sized data set, about 5,000 records in each table.
How can we tune it?
Regards,
Kabindra
Edit 1:
Regarding the result of the above query: the script took nearly 40 minutes to complete. Since I need to run similar scripts on several other tables, I would like to tune it and make it faster.

Your tables need some modifications and indexes to make this faster.
First, PersonAddressList.CITY is a varchar while the CityList.ID it will eventually reference is an int; as long as the two columns have different data types, a join between them generally cannot use an index even if one exists.
Second, you need proper indexes.
I would start with
alter table PersonAddressList add index city_idx(CITY);
alter table CityList add index City_Name_idx(City_Name);
Then I would use the following update command:
update PersonAddressList p
join CityList c on c.City_Name = p.CITY
set p.CITY = c.ID
The above query will be faster; just make sure that CITY and City_Name have the same data type and size before adding the indexes.
Once the data is updated, fix the structure:
drop index city_idx on PersonAddressList;
alter table PersonAddressList change CITY CITY int;
alter table PersonAddressList add index city_idx(CITY);
Finally, make sure that CityList.ID is indexed; if it is the primary key, which is most likely, it is indexed by default.
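
Before running the UPDATE and converting CITY to int, it may help to look for rows the join will not touch. The two checks below are a sketch using only the table and column names from the question. A person row whose city text has no match in CityList would still hold a name when the column type is changed, and a city name that appears more than once in CityList makes the assigned ID ambiguous.

```sql
-- Person rows whose CITY text matches no City_Name.
SELECT p.ID, p.CITY
FROM PersonAddressList p
LEFT JOIN CityList c ON c.City_Name = p.CITY
WHERE c.ID IS NULL;

-- City names that occur more than once in CityList.
SELECT City_Name, COUNT(*) AS copies
FROM CityList
GROUP BY City_Name
HAVING COUNT(*) > 1;
```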

Merge data from 2 tables, use only unique rows

I have 2 tables in my database.
Table "primary":
primary_id
primary_date
primary_measuredData
Table "temporary":
temporary_id
temporary_date
temporary_measuredData
The tables have other columns, but these are the important ones.
What I want is the following.
Table "primary" consists of verified measuredData. If data is available there, the output should take it from primary; if not, it should take it from temporary.
In about 99.99% of cases all old data is in primary, and only the last day comes from the temporary table.
Example:
primary table:
2013-02-05; 345
2013-02-07; 123
2013-02-08; 3425
2013-02-09; 334
temporary table:
2013-02-06; 567
2013-02-07; 1345
2013-02-10; 31
2013-02-12; 33
I am looking for the SQL query that outputs:
2013-02-05; 345 (from primary)
2013-02-06; 567 (from temporary, no value available from prim)
2013-02-07; 123 (from primary, both prim & temp have this date so primary is used)
2013-02-08; 3425 (primary)
2013-02-09; 334 (primary)
2013-02-10; 31 (temp)
2013-02-12; 33 (temp)
As you can see, there are no duplicate dates, and if data is available in the primary table, that data is used.
I have no idea how to solve this, so I can't give you any "this is what I've done so far :D"
Thanks!
EDIT:
The value of "measuredData" can differ between temp and primary. This is because temp stores a temporary value, and later, when the data is verified, it goes into the primary table.
EDIT 2:
I changed the primary table and added a new column "temporary", so that I store all the data in the same table. When the primary data is updated, it updates the temporary data with the new numbers. This way I don't need to merge 2 tables into one.

You should start with a UNION QUERY like this:
SELECT p.primary_date AS dt, p.primary_measuredData as measured
FROM
`primary` p
UNION ALL
SELECT t.temporary_date, t.temporary_measuredData
FROM
`temporary` t LEFT JOIN `primary` p
ON p.primary_date=t.temporary_date
WHERE p.primary_date IS NULL
A LEFT JOIN filtered to rows with no match (p.primary_date IS NULL) returns all rows from the temporary table that are not present in the primary table, and UNION ALL adds every row from the primary table.
You might want to add an ORDER BY clause to the whole query.
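
A sketch of the full query with that ORDER BY added. The source column is an addition for illustration and is not in the original answer; it only labels which table each row came from. In MySQL, an ORDER BY placed after the last parenthesized SELECT applies to the combined result and can refer to the column aliases of the first SELECT.

```sql
(SELECT p.primary_date AS dt,
        p.primary_measuredData AS measured,
        'primary' AS source              -- illustrative label, not in the answer
 FROM `primary` p)
UNION ALL
(SELECT t.temporary_date,
        t.temporary_measuredData,
        'temporary'
 FROM `temporary` t
 LEFT JOIN `primary` p ON p.primary_date = t.temporary_date
 WHERE p.primary_date IS NULL)           -- keep only dates missing from primary
ORDER BY dt;
```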
