Many-to-many indexing issue, MySQL

I have a many-to-many table that is going to have millions of rows. Let me describe my confusion with an example.
example:
table: car_dealer_rel
opt:1
columns: car_id: int unsigned, dealer_id: int unsigned
index on: car_id, dealer_id
car_id|dealer_id
-------|---------
1 | 1
1 | 2
....
sub-opt:1: Here I can have a separate index on each column.
sub-opt:2: One combined index on the 2 columns.
opt-2:
one column table
col: car_id_dealer_id: varchar:21
index on: PKI on this single column.
Here the idea is to store values as car_id.dealer_id and to search with patterns such as %.xxx and/or xxx.%
car_id_dealer_id
----------------
1.1
1.2
1.15
2.10
...
...
After millions of records, which option will be faster for:
reads
add/update/delete
I am a novice with MySQL; all help is appreciated.

With the first one
car_id|dealer_id
-------|---------
1 | 1
1 | 2
you can easily create composite indexes for both directions:
create index ind1 on car_dealer_rel (car_id, dealer_id);
create index ind2 on car_dealer_rel (dealer_id, car_id);
These work very fast,
and you can easily filter in either direction:
where car_id = your_value
or
where dealer_id = another_value
or using both
With the second one you can't do this easily: you frequently need string manipulation, which prevents MySQL from using the index, and some conditions can't be expressed in SQL at all.
For update, insert and delete, the performance remains practically the same.
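As a rough sketch of the two-column option: the column types come from the question, while the key name idx_dealer_car, the literal values, the InnoDB engine and the table name car_dealer_opt2 are assumptions for illustration. Using a composite primary key in place of ind1 is my substitution; it gives the same index and also rejects duplicate pairs.

```sql
-- Junction table for option 1. The composite primary key doubles as the
-- (car_id, dealer_id) index and also prevents duplicate pairs.
CREATE TABLE car_dealer_rel (
    car_id    INT UNSIGNED NOT NULL,
    dealer_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (car_id, dealer_id),
    KEY idx_dealer_car (dealer_id, car_id)  -- reverse direction
) ENGINE=InnoDB;

-- Can be served by the primary key (leftmost column is car_id).
SELECT dealer_id FROM car_dealer_rel WHERE car_id = 1;

-- Can be served by idx_dealer_car (leftmost column is dealer_id).
SELECT car_id FROM car_dealer_rel WHERE dealer_id = 2;

-- Option 2 equivalent of the dealer lookup (hypothetical table name).
-- The leading wildcard means a B-tree index on car_id_dealer_id cannot
-- be used to narrow the search:
-- SELECT car_id_dealer_id FROM car_dealer_opt2
-- WHERE car_id_dealer_id LIKE '%.2';
```

Both indexes here are covering for these lookups, so MySQL does not need to touch any other row data to answer them.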

It depends on the actual queries you use. I suggest running EXPLAIN first, on a good amount of dummy data, to understand how MySQL is going to execute your query.
But if you are going to find records by car_id alone, or by car_id and dealer_id, you can use a composite index (car_id, dealer_id).
If you also want to find records by dealer_id alone, you can add an additional index on the dealer_id column.
Your one-column table option is not very good because:
You cannot find rows by dealer_id fast.
The table schema is not normalized.
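A minimal sketch of the EXPLAIN check suggested above. It assumes car_dealer_rel exists with only the composite index (car_id, dealer_id) and has been filled with test data; the literal values 42 and 7 are placeholders.

```sql
-- Lookup by the leftmost column of the composite index.
EXPLAIN SELECT dealer_id FROM car_dealer_rel WHERE car_id = 42;

-- Lookup by dealer_id alone, which is not the leftmost column.
EXPLAIN SELECT car_id FROM car_dealer_rel WHERE dealer_id = 7;
```

In the output, the key column names the index MySQL chose and rows is its estimate of how many rows it must examine. If the second statement reports type ALL or index, every row or every index entry is being scanned, which is the sign that a separate index starting with dealer_id would help.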

Related

Do relations between tables in a database speed up query performance?

I am using joins in my queries and I want to know whether relations between database tables increase the performance of those queries.
Thank You.
For boosting performance, you should use indexes and appropriate data types as well (storing a number as a string takes more space, and comparisons may be less efficient).
Relations between tables, i.e. foreign keys, are constraints: you cannot insert a value into the referencing table without a matching record in the referenced table. They are a way to keep data integrity, e.g.
Table1
id table2_id
1 1
2 1
3 3
Table2
id some_column
1 123
2 123
3 null
Here, Table1.table2_id references Table2.id. Now you won't be able to insert a row such as (4, 4) into Table1, because there's no id = 4 in Table2.
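The example above could be written out roughly as follows. The column types, the constraint name fk_table1_table2 and the InnoDB engine are assumptions; only the table names, column names and values come from the answer.

```sql
CREATE TABLE Table2 (
    id          INT UNSIGNED NOT NULL PRIMARY KEY,
    some_column INT NULL
) ENGINE=InnoDB;

CREATE TABLE Table1 (
    id        INT UNSIGNED NOT NULL PRIMARY KEY,
    table2_id INT UNSIGNED NOT NULL,
    CONSTRAINT fk_table1_table2
        FOREIGN KEY (table2_id) REFERENCES Table2 (id)
) ENGINE=InnoDB;

INSERT INTO Table2 (id, some_column) VALUES (1, 123), (2, 123), (3, NULL);
INSERT INTO Table1 (id, table2_id) VALUES (1, 1), (2, 1), (3, 3);

-- Rejected with a foreign key constraint error: there is no Table2.id = 4.
INSERT INTO Table1 (id, table2_id) VALUES (4, 4);
```

One side effect is relevant to the performance question: InnoDB needs an index on the referencing column and creates one on table2_id automatically if none exists. It is that index, rather than the constraint itself, that can speed up joins between the two tables.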

MySQL partition - How to do list partitioning of a table that contains a unique column?

I am doing MySQL list partitioning. My table data is as below:
id | unique_token | city   | student_name
---|--------------|--------|-------------
1  | xyz          | mumbai | sanjay
2  | abc          | mumbai | vijay
3  | def          | pune   | ajay
In the above table the unique_token column has a unique key, and I want to do list partitioning on the city column. As per the MySQL documentation, every partitioning column must be part of every unique key of the table, so in order to partition by city I have to create a new unique key as unique_key(unique_token, city).
Now the issue is that the unique_token column should be unique on its own. If I insert two rows such as ('xyz','banglore') and ('xyz','pune'), both will be accepted, and then the unique_token column won't be unique at all.
I want to know how to do list partitioning on this table without having duplicate data in the unique_token column.
There are limitations in MySQL's PARTITION implementation. In particular, no FOREIGN KEYs and no UNIQUE keys unless they happen to include the "partition key". These limitations exist because of the unacceptable cost of implementing them. This, in turn, is caused by each partition being essentially a separate 'table', with its own indexes. There is no "index" that spans the entire set of partitions. Such a 'global index' would make FKs and UNIQUE keys viable and efficient. This may come in version 5.8.
Meanwhile, let me change your question from "How to do LIST partitioning..." to "Why do LIST partitioning at all?". I know of no utility -- not performance, not convenience, not anything else, for PARTITION BY LIST. If you have a reason for wanting to do it, please explain. I would be happy to change my rather negative attitude toward partitioning. (I know of only 4 use cases for PARTITION BY RANGE, but that is another topic.)
It is better to create a composite primary key on the (unique_token, city) columns:
alter table table_name add constraint constraint_name primary key (unique_token, city);
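To make the restriction described in the question concrete, here is a minimal sketch. The table name student, the column types and the partition names are assumptions; the column names and sample values come from the question.

```sql
-- Every unique key, including the primary key, must contain the
-- partitioning column, so city has to be added to both.
CREATE TABLE student (
    id           INT UNSIGNED NOT NULL,
    unique_token VARCHAR(32)  NOT NULL,
    city         VARCHAR(32)  NOT NULL,
    student_name VARCHAR(64)  NOT NULL,
    PRIMARY KEY (id, city),
    UNIQUE KEY uq_token_city (unique_token, city)
)
PARTITION BY LIST COLUMNS (city) (
    PARTITION p_mumbai   VALUES IN ('mumbai'),
    PARTITION p_pune     VALUES IN ('pune'),
    PARTITION p_banglore VALUES IN ('banglore')
);

-- Both rows are accepted: the key only guarantees that the pair
-- (unique_token, city) is unique, not unique_token by itself.
INSERT INTO student VALUES (1, 'xyz', 'banglore', 'sanjay');
INSERT INTO student VALUES (2, 'xyz', 'pune', 'vijay');
```

A composite key on (unique_token, city) therefore satisfies the partitioning rule but does not restore uniqueness of unique_token alone. Keeping the table unpartitioned with a plain UNIQUE key on unique_token is the option that does.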

Remove duplicate values without ID

I have a table like this:
uuid | username | first_seen | last_seen | score
Previously, the table's primary key was an ascending "player_id" column. I removed this player_id as I no longer needed it. I want to make the 'uuid' the primary key, but there are a lot of duplicates. I want to remove all these duplicates from the table, but keep the first one (based on the row number, the first row stays).
How can I do this? I've searched everywhere, but the answers all show how to do it if you have a row ID column...
I highly advocate having auto-incremented integer primary keys. So, I would encourage you to go back. These are useful for several reasons, such as:
They tell you the insert order of rows.
They are more efficient for primary keys.
Because primary keys are clustered in MySQL, new rows always go at the end.
But, you don't have to follow that advice. My recommendation would be to insert the data into a new table and reload into your desired table:
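-- copy one row per uuid into a temporary table
create temporary table tt as
    select t.*
    from t
    group by t.uuid;
-- empty the original table and add the primary key
truncate table t;
alter table t add constraint pk_uuid primary key (uuid);
-- reload the de-duplicated rows
insert into t
    select * from tt;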
Note: I am using a (mis)feature of MySQL that allows you to group by one column while pulling columns not in the group by. It only works when the ONLY_FULL_GROUP_BY SQL mode is disabled, and that mode is enabled by default from MySQL 5.7.5 on. I don't like this extension, but you do not specify how to choose the particular row you want. This will give values for the other columns from matching rows. There are other ways to get one row per uuid.
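One of those other ways, sketched under a few assumptions: t is the original table, t_dedup and t_old are hypothetical names, and uuid is a CHAR or VARCHAR column (a TEXT column would need a prefix length in the key). It relies on INSERT IGNORE rather than on the GROUP BY extension.

```sql
-- Empty copy of t with the same columns, plus the new primary key.
CREATE TABLE t_dedup LIKE t;
ALTER TABLE t_dedup ADD PRIMARY KEY (uuid);

-- IGNORE turns duplicate-key errors into warnings, so only the first
-- row read for each uuid is stored and later duplicates are skipped.
INSERT IGNORE INTO t_dedup SELECT * FROM t;

-- Swap the tables; keep the original until the result has been checked.
RENAME TABLE t TO t_old, t_dedup TO t;
```

Without an ORDER BY on the SELECT, which duplicate counts as "first" depends on the order in which MySQL happens to read t, so this does not guarantee the original insert order either.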

Merge data from 2 tables, use only unique rows

I have 2 tables in my database:
primary: primary_id, primary_date, primary_measuredData
temporary: temporary_id, temporary_date, temporary_measuredData
Well, the tables have other columns, but these are the important ones.
What I want is the following.
Table "primary" consists of verified measuredData. If data is available here, the output should choose from primary first, and if it is not available in primary, choose from temporary.
In about 99.99% of the cases all old data is in the primary, and only the last day is from the temporary table.
Example:
primary table:
2013-02-05; 345
2013-02-07; 123
2013-02-08; 3425
2013-02-09; 334
temporary table:
2013-02-06; 567
2013-02-07; 1345
2013-02-10; 31
2013-02-12; 33
I am looking for the SQL query that outputs:
2013-02-05; 345 (from primary)
2013-02-06; 567 (from temporary, no value available from prim)
2013-02-07; 123 (from primary, both prim & temp have this date so primary is used)
2013-02-08; 3425 (primary)
2013-02-09; 334 (primary)
2013-02-10; 31 (temp)
2013-02-12; 33 (temp)
You see, no duplicate dates, and if data is available in the primary table, then the data from that table is used.
I have no idea how to solve this, so I can't give you any "this is what I've done so far :D"
Thanks!
EDIT:
The value of "measuredData" can differ from temp and primary. This is because temp is used to store a temporary value, and later when the data is verified it goes into the primary table.
EDIT 2:
I changed the primary table and added a new column "temporary", so that I store all the data in the same table. When the primary data is updated, it updates the temporary data with the new numbers. This way I don't need to merge 2 tables into one.
You should start with a UNION query like this:
SELECT p.primary_date AS dt, p.primary_measuredData as measured
FROM
`primary` p
UNION ALL
SELECT t.temporary_date, t.temporary_measuredData
FROM
`temporary` t LEFT JOIN `primary` p
ON p.primary_date=t.temporary_date
WHERE p.primary_date IS NULL
A LEFT JOIN where there's no match (p.primary_date IS NULL) will return all rows from the temporary table that are not present in the primary table. And using UNION ALL you can return all rows available in the first table as well.
You might want to add an ORDER BY clause to the whole query.
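A sketch of the same query with the ordering applied to the combined result. The source column is an addition of mine to show where each row came from; everything else follows the query above.

```sql
-- Parenthesizing each SELECT makes the trailing ORDER BY apply to the
-- whole UNION result rather than to the second SELECT only.
(SELECT p.primary_date AS dt,
        p.primary_measuredData AS measured,
        'primary' AS source
 FROM `primary` p)
UNION ALL
(SELECT t.temporary_date,
        t.temporary_measuredData,
        'temporary'
 FROM `temporary` t
 LEFT JOIN `primary` p
        ON p.primary_date = t.temporary_date
 WHERE p.primary_date IS NULL)
ORDER BY dt;
```

UNION ALL is sufficient here because the anti-join already removes the dates that exist in both tables, so no extra de-duplication pass is needed.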

Improve MySQL searches on huge tables

How do I deal with thousands of tuples in a table? How can searching be improved if there is no primary key in my table?
ex:
id attr
1 I'm
1 Too
1 Damn
2 Slow
2 To
2 Search
I can group the data together using group_concat(), but I'm unsure whether it will scan my complete table to get the end result. And if yes, how can that be improved?
Create an index on the column you want to use in the search query to improve the search.
E.g. if your table is CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1));
then an index on column B can be created using CREATE INDEX B ON T1 (B);
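Applied to the table from the question, this could look roughly like the sketch below. The table name attrs, the column types, the index name idx_id and the alias attrs_joined are assumptions.

```sql
-- No primary key is possible because id repeats, but a plain
-- (non-unique) index on id still lets MySQL find one group quickly.
CREATE TABLE attrs (
    id   INT         NOT NULL,
    attr VARCHAR(32) NOT NULL,
    KEY idx_id (id)
);

INSERT INTO attrs (id, attr) VALUES
    (1, 'I''m'), (1, 'Too'), (1, 'Damn'),
    (2, 'Slow'), (2, 'To'), (2, 'Search');

-- With idx_id in place, only the rows for id = 1 need to be read
-- before GROUP_CONCAT combines them.
SELECT id, GROUP_CONCAT(attr SEPARATOR ' ') AS attrs_joined
FROM attrs
WHERE id = 1
GROUP BY id;
```

GROUP_CONCAT does not promise any particular order of the concatenated values unless an ORDER BY is given inside the call, and this table has no column that records the original order.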