Performance of CREATE GSI index in Couchbase 4.5

I have the query below:
SELECT DailyCampaignUsage.day date,
       SUM(ARRAY_SUM(DailyCampaignUsage.`statistics`[*].clicks)) clicks,
       SUM(ARRAY_SUM(DailyCampaignUsage.`statistics`[*].clicksCost)) revenue
FROM Inheritx DailyCampaignUsage
JOIN Inheritx Campaign ON KEYS ('Campaign|' || TOSTRING(DailyCampaignUsage.campaignId))
JOIN Inheritx Users ON KEYS ('User|' || TOSTRING(Campaign.`user`))
WHERE DailyCampaignUsage._type = 'DailyCampaignUsage'
  AND DATE_PART_MILLIS(STR_TO_MILLIS(DailyCampaignUsage.day), 'year') = 2016
  AND DATE_PART_MILLIS(STR_TO_MILLIS(DailyCampaignUsage.day), 'month') = 5
GROUP BY DailyCampaignUsage.day
ORDER BY DailyCampaignUsage.day
I only have an index on _type, like:
CREATE INDEX `Ottoman__type` ON `Inheritx`(`_type`)
When I run the above query it takes 10s.
I tried creating an index like:
CREATE INDEX `dailyCampaignUsage_type_clicks_cost` ON
`Inheritx`(`_type`, `day`, `statistics`[*].clicks, `statistics`[*].clicksCost)
WHERE `_type` = "DailyCampaignUsage" USING GSI
But it is not working; the query takes even longer, 13s.
I have also tried USE INDEX (dailyCampaignUsage_type_clicks_cost), but it does not help.
Which index should I create?

Can you post a sample document, the EXPLAIN output, how many documents you have, the Couchbase version, and the cluster setup (all services on the same node, or using MDS, multi-dimensional scaling)?
You may want to try including the year and month as index keys:
CREATE INDEX dailyCampaignUsage_type_clicks_cost ON
Inheritx(_type, DATE_PART_MILLIS(STR_TO_MILLIS(day), 'year'), DATE_PART_MILLIS(STR_TO_MILLIS(day), 'month'), day, ...) WHERE ...
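Spelled out, a sketch of what that index might look like (the index name and exact key list are illustrative, not from this thread; note that inside CREATE INDEX the expressions reference the document fields directly, without the query alias, and the query's filter expressions must match the index key expressions for the index to be chosen):
CREATE INDEX `dcu_year_month_day` ON `Inheritx`(
    DATE_PART_MILLIS(STR_TO_MILLIS(`day`), 'year'),
    DATE_PART_MILLIS(STR_TO_MILLIS(`day`), 'month'),
    `day`,
    `statistics`
) WHERE `_type` = 'DailyCampaignUsage' USING GSI
Because the index WHERE clause matches the query's _type predicate and the leading keys match the year/month expressions, the planner can turn the two equality predicates (year = 2016, month = 5) into a narrow index span instead of scanning every _type entry.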

Related

Access Database Slow Finding Any Records Not Matching

My Access Database is slow when finding non-matching records
SELECT
    RT3_Data_Query.Identifier, RT3_Data_Query.store, RT3_Data_Query.SOURCE,
    RT3_Data_Query.TRAN_CODE, RT3_Data_Query.AMOUNT,
    RT3_Data_Query.DB_CR_TYPE, RT3_Data_Query.status,
    RT3_Data_Query.TRAN_DATE, RT3_Data_Query.ACCEPTED_DATE,
    RT3_Data_Query.RECONCILED_DATE
FROM
    RT3_Data_Query
LEFT JOIN Debit_AO_Query ON RT3_Data_Query.[Identifier] = Debit_AO_Query.[Identifier]
WHERE
    Debit_AO_Query.Identifier IS NULL;
I'm doing a query over two queries I created. The last query just compares these two queries and shows what is missing between them, which is what I posted above. I'm matching an identifier between the two queries that looks like 583005-01-20185804.33, which is a combination of store, date, and amount.
Here is a link to the database:
https://wetransfer.com/downloads/15f912909fbe2ea0a5111e44b953d11a20190808195913/db9912
The query is slow because you don't use indexes on the tables and you join on concatenated fields (Identifier is Location & Date & Total)!
Each table needs a primary key or it is not a table! An AutoNumber will do for a start!
Indexing:
Add a field called id to each table, with datatype AutoNumber, and make it the PK.
Add a key for each field compared in the join and the WHERE clause (set all index properties (primary, unique, ignore) to No)!
For table RT3_Data (because it is huge, create a copy first, then delete the data, or creating the index will fail on MaxLocksPerFile):
store
AMOUNT
TRAN_DATE
After that, reimport the data from the copy with this query:
INSERT INTO RT3_DATA
SELECT [Copy Of RT3_DATA].*
FROM [Copy Of RT3_DATA];
For table Debit_AO:
Location
Total
Date (should be renamed, since Date() is a VBA function)
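In Access SQL, those indexes can be created with DDL along these lines (the index names are illustrative, and Access runs one DDL statement per query):
CREATE INDEX idx_rt3_store ON RT3_Data (store);
CREATE INDEX idx_rt3_amount ON RT3_Data (AMOUNT);
CREATE INDEX idx_rt3_tran_date ON RT3_Data (TRAN_DATE);
CREATE INDEX idx_ao_location ON Debit_AO (Location);
CREATE INDEX idx_ao_total ON Debit_AO (Total);
CREATE INDEX idx_ao_date ON Debit_AO ([Date]);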
Now change the query RT3_Data_Query Without Matching Debit_AO_Query to:
SELECT RT3_Data.store
,RT3_Data.SOURCE
,RT3_Data.TRAN_CODE
,RT3_Data.AMOUNT
,RT3_Data.DB_CR_TYPE
,RT3_Data.STATUS
,RT3_Data.TRAN_DATE
,RT3_Data.ACCEPTED_DATE
,RT3_Data.RECONCILED_DATE
FROM RT3_Data
LEFT JOIN Debit_AO
ON RT3_Data.[store] = Debit_AO.[Location]
AND RT3_Data.[AMOUNT] = Debit_AO.[Total]
AND RT3_Data.[TRAN_DATE] = Debit_AO.[DATE]
WHERE (
(
Debit_AO.Location IS NULL
AND Debit_AO.Total IS NULL
AND Debit_AO.[Date] IS NULL
)
);
Now the query executes in less than 10 seconds, and for sure there are more possible optimizations (e.g. a composite index).
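For instance, a composite index over the three join fields might look like this (a sketch; the name is illustrative):
CREATE INDEX idx_rt3_join ON RT3_Data (store, AMOUNT, TRAN_DATE);
A single composite key lets the join probe one index instead of merging three separate ones.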

MySQL trigger, view, separate table, or on-the-fly calculation for loyalty points?

What is the least resource-intensive way to calculate a sum of points from two tables? The total point tally is calculated by adding points from a table points and subtracting points from a table points_redeemed.
points:
CREATE TABLE IF NOT EXISTS points(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
user__id INT,
tx__id INT,
points INT
) ENGINE=MyISAM;
points_redeemed:
CREATE TABLE IF NOT EXISTS points_redeemed(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
user__id INT,
points INT
) ENGINE=MyISAM;
(Both tables above are heavily simplified.)
points is populated upon a transaction (recorded in a different table). When transaction values are changed or voided, the corresponding row in points is updated as well.
points_redeemed is populated when user redeems their accumulated points for a reward.
Use cases:
show stats to user and admin: total, redeemed, and unredeemed points
check unredeemed points upon user-initiated redeem request
The options I've come up with are:
a) Triggers.
Create a table points_sum with one row per user.id and add three triggers:
on insert into points
on update of points
on insert into points_redeemed
I've heard that MySQL triggers are not that performant though, so I'm simply not sure if this is a good idea.
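For illustration, a minimal sketch of option (a)'s insert trigger, assuming a points_sum table with user__id and total columns (the table layout and trigger name are assumptions, not from the original post):
-- keep the per-user running total in sync on every new points row
CREATE TRIGGER trg_points_ai
AFTER INSERT ON points
FOR EACH ROW
UPDATE points_sum
   SET total = total + NEW.points
 WHERE user__id = NEW.user__id;
The update and points_redeemed triggers would follow the same pattern, with the appropriate sign and OLD/NEW deltas.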
b) View.
Create a view that calculates points.points - points_redeemed.points. Not sure if this is any better than just doing it on the fly.
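As a sketch, such a view might look like the following (the view name is illustrative; note that MySQL views are not materialized, so this still computes on the fly under the hood):
CREATE VIEW points_balance AS
SELECT p.user__id,
       -- earned minus redeemed; COALESCE covers users with no redemptions
       SUM(p.points) - COALESCE(
           (SELECT SUM(r.points)
              FROM points_redeemed r
             WHERE r.user__id = p.user__id), 0) AS balance
FROM points p
GROUP BY p.user__id;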
c) Sum table.
Create a table points_sum and update it with a separate query each time points or points_redeemed is inserted into or updated. This feels like the least effective way, but then again I could be wrong and it might be the best way.
d) On the fly.
Query points from both tables on the fly and calculate the difference. This is the easiest and probably the most accurate way, but it can potentially clog up the pipes a lot when the tables grow in size. Then again, are any of the other options any better in that regard?
Edit: These are the current on-the-fly queries.
First, a very straight-forward query from points_redeemed:
SELECT *
FROM points_redeemed
WHERE user__id = 1
Second, the points table is queried:
(
SELECT p.*,
tx.*
FROM points p
INNER JOIN tx ON p.tx__id = tx.id
WHERE p.user__id = '1'
AND p.tx_is_external IS NULL
ORDER BY p.date DESC
)
UNION
(
SELECT p.*,
tx.*
FROM points p
INNER JOIN tx_external tx ON p.tx__id = tx.id
WHERE p.user__id = '1'
AND p.tx_is_external = '1'
ORDER BY p.date DESC
)
(There are several named columns SELECTed that I abbreviated as * here. In the second query, about 40 columns are fetched per row.)
After this, I'm looping through both result sets and adding/subtracting points on the app layer.
My worry is that the two separate queries, and the joins in the second query, might "clog the pipes" when the tx tables grow in size (and the points table too). That's why I'm trying to figure out a better way that will save resources at runtime.
The more I think about it though... transactions and points inserts will probably happen a lot more frequently than a user looking up their current point status. In that scenario, a trigger would probably have the opposite effect.
I'd appreciate any kind of insight. Thank you!
WHERE user__id = 1 needs INDEX(user__id) on the table.
( SELECT ... ORDER BY ... ) UNION ( SELECT ... ORDER BY ... ) will not have a particular order; do you need to move the ORDER BY outside?
tx and tx_external need id to be indexed (PRIMARY KEY?)
Did you really want UNION DISTINCT? That's the default. UNION ALL is faster.
Fix those, then see if you still need to discuss triggers, etc.
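Concretely, the suggested fixes might look like this (a sketch; it assumes id is not already the primary key on tx and tx_external, and the index names are illustrative):
-- speed up the WHERE user__id = ... lookups
ALTER TABLE points ADD INDEX idx_user (user__id);
ALTER TABLE points_redeemed ADD INDEX idx_user (user__id);
-- make the joins on tx.id / tx_external.id keyed lookups
ALTER TABLE tx ADD PRIMARY KEY (id);
ALTER TABLE tx_external ADD PRIMARY KEY (id);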

How can I delete records in a table so there is one reading a minute?

I have a table in my MySQL database that stores GPS readings. It stores readings for different users taken at different times.
It should contain one reading per minute per user. However, due to a bug in the app that collects the reading, sometimes it has one reading per second per user, which is way too much data for our requirements.
Is there a way I can delete rows from the table using a query such that each user does not have more than one reading per minute? I want to avoid having to write a program to do this if possible!
Thanks!
First note that deleting a lot of records from a table can be highly non-performant. Often, it is better to simply recreate the table (or a new table), keeping the first reading per user, per day, per minute:
create table new_gps as
select gps.*
from gps join
     (select userid, min(dt) as mindt
      from gps
      group by userid, date(dt), floor(time_to_sec(dt) / 60)
     ) gpsmin
     on gps.userid = gpsmin.userid and gps.dt = gpsmin.mindt;
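If you take the recreate route, you can then swap the tables in one step (a sketch; note that a table built with CREATE TABLE ... AS does not inherit the indexes or constraints of gps, so recreate those afterwards):
RENAME TABLE gps TO gps_old, new_gps TO gps;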
You can use the same idea for the delete:
delete gps
from gps left join
     (select userid, min(dt) as mindt
      from gps
      group by userid, date(dt), floor(time_to_sec(dt) / 60)
     ) gpsmin
     on gps.userid = gpsmin.userid and gps.dt = gpsmin.mindt
where gpsmin.userid is null;
An easier way is to put a unique key on (userid, dt) with the IGNORE option, so MySQL erases all duplicate rows and prevents new duplicates.
The only thing to do is:
ALTER IGNORE TABLE gps ADD UNIQUE KEY (userid, dt);
Note, though, that this only removes rows with exactly identical user/timestamp pairs, not the extra per-second readings, and ALTER IGNORE was removed in MySQL 5.7.

Best way to store a lot of data with timestamp in MySQL

What should I do?
Imagine a tennis match.
An operator pushes buttons (actions): "Ace", "Fault", "Winner", "Unforced error", etc.
We have a lot of operators and matches at the same time, and we have a lot of requests to the DB from users (~1000 per minute).
What is the best way to store match_id, player, action, time_of_action?
1) A table with one row for every match: match_id, actions. The actions, players, and timestamps are encoded into one string: the player number as TINYINT, the action id as CHAR, the timestamp as TIMESTAMP.
Example: actions = "1A2014-11-28 09:01:21 2W2014-11-28 09:01:33 1F2014-11-28 09:01:49"
2) A table with multiple rows for one match: id, match_id, player, action_id, current timestamp (id PRIMARY KEY).
That will be about 250K rows after one day (300 per match * 40 matches per tournament * 20 tournaments per day).
Which is better: a lot of rows and a lot of requests like SELECT player, action_id, timestamp FROM scores WHERE match_id = N,
or
the same number of requests over far fewer rows (1/300) but with much bigger data per row?
Sorry for my rough English; I hope you understand me. If not, tell me.
Added:
I'm going to use it for match statistics, live or after the match.
Users open the page "Statistics of match Federer - Nadal", and the page refreshes every 10-30 seconds.
Example: http://www.wimbledon.com/en_GB/slamtracker/slamtracker.html?ts=1419259452680&ref=www.wimbledon.com/en_GB/slamtracker/index.html&syn=none&
I suggest you create reference tables called:
match (match_id, name, venue): a row for each distinct match
player (player_id, name): a row for each distinct player
action (action_id, name): this is a code list, 1=Ace, 2=Fault, etc.
These tables will be relatively static.
Then, I suggest you create an event table containing the following items in the following order.
match_id
ts (TIMESTAMP)
action_id
player_id
You should include all four of these columns in a composite primary key, in the order I have shown them.
Every time your scorers record an action you'll insert a new row to this table.
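A minimal DDL sketch of that event table (the column types are assumptions, not from the original answer):
CREATE TABLE event (
    match_id  INT       NOT NULL,
    ts        TIMESTAMP NOT NULL,
    action_id INT       NOT NULL,
    player_id INT       NOT NULL,
    -- the composite PK in this order makes per-match, time-ordered reads fast
    PRIMARY KEY (match_id, ts, action_id, player_id)
);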
When you want to display the actions for a particular match, you can do this:
SELECT event.ts,
action.name AS action,
player.name AS player
FROM event
JOIN player ON event.player_id = player.player_id
JOIN action ON event.action_id = action.action_id
WHERE event.match_id = <<whatever match ID>>
ORDER BY event.match_id, event.ts
Because of the order of columns in the composite primary key on the event table, this kind of query will be very efficient even when you're inserting lots of new rows to that table.
MySQL is made for this kind of application. Still, when your site begins to receive tons of user traffic, you probably should arrange to run these queries just once every few seconds, cache the results, and use the cached results to send information to your users.
If you want to retrieve the match IDs for all the matches presently active (that is, with an event within the last ten minutes) you can do this.
SELECT DISTINCT `match`.match_id, `match`.name, `match`.venue
FROM event
JOIN `match` ON event.match_id = `match`.match_id
WHERE event.ts >= NOW() - INTERVAL 10 MINUTE
If you need to do this sort of query a lot, I suggest you create an extra index on (ts, match_id).
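That extra index, as DDL (the name is illustrative):
ALTER TABLE event ADD INDEX idx_ts_match (ts, match_id);
It lets the ten-minute filter scan a narrow, ts-ordered slice of the index and read the match IDs straight from it.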

MySQL multiple tables relationship (code opinion)

I have 4 tables: rooms(id, name, description), clients(id, name, email), cards(id, card_number, exp_date, client_id) and orders(id, client_id, room_id, card_id, start_date, end_date).
The tables are all InnoDB and are pretty much simple. What I need is to add relationships between them. What I did was to assign cards.client_id as a Foreign Key to db.clients and orders.client_id, orders.room_id and orders.card_id as Foreign Keys to the other tables.
My question: is this way correct and reliable? I never had the need to use Foreign Key before now and this is my first try. All the Foreign Keys are also indexes.
Also, what's the easiest way to retrieve all the information I need for db.orders?
I need a query to output: who the client is, what his card details are, what room(s) he ordered, and for what period he is checked in.
Can I accomplish this query based on the structure I created?
You must create the FKs on all columns that relate to other tables. In your case, create them on: cards.client_id, orders.client_id, orders.room_id, orders.card_id.
MySQL automatically creates indexes for these FKs.
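A sketch of those FK definitions (the constraint names are illustrative):
ALTER TABLE cards
    ADD CONSTRAINT fk_cards_client FOREIGN KEY (client_id) REFERENCES clients (id);
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_client FOREIGN KEY (client_id) REFERENCES clients (id),
    ADD CONSTRAINT fk_orders_room   FOREIGN KEY (room_id)   REFERENCES rooms (id),
    ADD CONSTRAINT fk_orders_card   FOREIGN KEY (card_id)   REFERENCES cards (id);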
On your select, I believe it can be the following:
SELECT * FROM orders
INNER JOIN clients ON clients.id = orders.client_id
INNER JOIN cards ON cards.id = orders.card_id
INNER JOIN rooms ON rooms.id = orders.room_id
I do not know which columns you need; just replace the * with the columns you need, so the SQL is faster.