Combining MySQL queries of unrelated data when the related point is a SUBSTRING()

I'm having a hard time wrapping my head around how to combine these queries together.
Here is my db setup:
Table 1:
querylog - log of all API calls the application makes
- id (AI)
- url (VARCHAR)
- when (DATETIME)
Table 2:
trades - data returned from API calls
- tid (trade ID, unique)
- price
- date (datetime) - when the trade occurred, not when it was inserted
- etc
I am trying to get a count of records added in the last hour.
I can use this SQL statement to get the first trade TID added in the last hour (the pre-modified URL is in the form: https://API_INFO_HERE/trades?id=TID_HERE):
SELECT SUBSTRING(url,50, 50) as oldest from querylog where url like 'https://API_INFO_HERE/trades?%' and `when`>= DATE_SUB(NOW(),INTERVAL 1 HOUR) ORDER BY `querylog`.`when` DESC LIMIT 1
Then, to get the count, all I need is:
SELECT count(*) FROM `trades` where tid > VALUE_FROM_PREVIOUS_QUERY
If anyone could help me combine the queries I would be very appreciative!

I think you should do something like this:
SELECT count(*) FROM `trades`
WHERE tid > (SELECT SUBSTRING(url,50, 50) as oldest from querylog where url like 'https://API_INFO_HERE/trades?%' and `when`>= DATE_SUB(NOW(),INTERVAL 1 HOUR) ORDER BY `querylog`.`when` DESC LIMIT 1)
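Keep in mind, though, that it may be slow if the subquery cannot use an index. SUBSTRING(url, 50, 50) is only computed for the single row that is returned, so what matters most is how quickly querylog can be filtered on `when`.
You should probably consider logging the request time in the trades table as well or, alternatively, adding an appropriate index on querylog (for example on `when`) to speed things up.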
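The combined query can be tried out end to end with a small sketch. This one uses Python's built-in sqlite3 module as a stand-in for MySQL, so substr() and datetime(?, '-1 hour') replace SUBSTRING() and DATE_SUB(NOW(), INTERVAL 1 HOUR). The URL prefix, the sample rows, and the helper names are all made up for illustration, and the CAST to an integer is an assumption so that the tid comparison is numeric rather than textual.

```python
import sqlite3

# Hypothetical prefix: the real API URL is elided in the question.
PREFIX = "https://api.example.invalid/trades?id="


def build_demo():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE querylog (id INTEGER PRIMARY KEY, url TEXT, "when" TEXT)')
    conn.execute("CREATE TABLE trades (tid INTEGER PRIMARY KEY, price REAL)")
    conn.executemany(
        'INSERT INTO querylog (url, "when") VALUES (?, ?)',
        [
            (PREFIX + "100", "2024-01-01 10:30:00"),  # older than one hour
            (PREFIX + "105", "2024-01-01 11:15:00"),
            (PREFIX + "110", "2024-01-01 11:45:00"),  # most recent call
        ],
    )
    conn.executemany(
        "INSERT INTO trades (tid, price) VALUES (?, ?)",
        [(tid, 1.0) for tid in (100, 105, 108, 110, 111, 112, 115)],
    )
    return conn


def count_new_trades(conn, now):
    """Count trades with a tid above the one embedded in the selected log URL."""
    sql = """
        SELECT COUNT(*) FROM trades
        WHERE tid > (
            SELECT CAST(substr(url, ?) AS INTEGER)   -- text after the prefix
            FROM querylog
            WHERE url LIKE ? AND "when" >= datetime(?, '-1 hour')
            ORDER BY "when" DESC                     -- same ordering as the original query
            LIMIT 1
        )
    """
    params = (len(PREFIX) + 1, PREFIX + "%", now)
    return conn.execute(sql, params).fetchone()[0]
```

Passing the current time in as a parameter keeps the sketch deterministic. If no API call was logged in the last hour, the subquery yields NULL and the count comes back as 0.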

Related

Returning rows where there is no reconnect timestamp

I'm new to SQL, so please forgive me for what seems like a basic question.
I have a table called 'timestamps' that is structured like this:
ID   Connected_Unixtime   Disconnected_Unixtime
3    1658260585           1658260645
1    1658260465           1658260525
2    1658260345           1658260405
1    1658260225           1658260285
I'm trying to write some SQL that returns rows of IDs that disconnected and have had more than one hour since the last connect. So if there's an ID that has a Disconnected_Unixtime but no Connected_Unixtime after one hour of disconnecting, return the most recent row.
Basically the results should look like this, with just IDs that didn't have a new connection after an hour:
ID   Connected_Unixtime   Disconnected_Unixtime
3    1658260585           1658260645
2    1658260345           1658260405
I tried to use the NOT EXISTS operator, since a new row is created every time an ID connects. This is what I came up with, but it's not correct.
select *
from timestamps
where not exists (select *
from timestamps
where (Disconnected_Unixtime) > now() - interval 1 hour
order by Disconnected_Unixtime desc
)
Thanks for the help.

How to speed up query for datetime in Mysql

SELECT *
FROM LOGS
WHERE datetime > DATE_SUB(NOW(), INTERVAL 1 MONTH)
I have a big table LOGS (InnoDB). When I try to get the last month's data, the query takes too long.
I created an index on the datetime column, but it doesn't seem to help. How can I speed up this query?
Since the records are inserted from oldest to newest, you could split this into two calls. The first call fetches the id of the oldest record in the range:
SELECT MIN(id) INTO @oldestRecordID
FROM LOGS
WHERE datetime > DATE_SUB(NOW(), INTERVAL 1 MONTH);
Then, with that id, request all records where id >= @oldestRecordID:
SELECT *
FROM LOGS
WHERE id >= @oldestRecordID;
It's multiple calls, but it could be faster. I am sure you could combine those two calls into one, too.
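One way the two calls might be combined is with a scalar subquery. The sketch below uses Python's sqlite3 module in place of MySQL and takes the cutoff as a parameter instead of computing DATE_SUB(NOW(), INTERVAL 1 MONTH). The sample rows are invented, and the approach only works under the assumption stated above, namely that ids grow in the same order as datetime.

```python
import sqlite3


def build_logs():
    """In-memory LOGS table whose ids increase together with datetime."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE LOGS (id INTEGER PRIMARY KEY, datetime TEXT)")
    conn.executemany(
        "INSERT INTO LOGS (id, datetime) VALUES (?, ?)",
        [
            (1, "2024-01-01 08:00:00"),
            (2, "2024-01-15 08:00:00"),
            (3, "2024-02-10 08:00:00"),
            (4, "2024-02-20 08:00:00"),
        ],
    )
    return conn


def recent_logs(conn, cutoff):
    """Return all rows from the oldest row newer than `cutoff` onwards."""
    sql = """
        SELECT id, datetime
        FROM LOGS
        WHERE id >= (SELECT MIN(id) FROM LOGS WHERE datetime > ?)  -- >= keeps the oldest match
        ORDER BY id
    """
    return conn.execute(sql, (cutoff,)).fetchall()
```

When nothing is newer than the cutoff, MIN(id) is NULL and the outer query returns no rows.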
Probably the only thing you can do is create a clustered index on datetime (in InnoDB the clustered index is the primary key, so this means making datetime its leading column). This will ensure that the values are co-located.
However, I don't think this will solve your real problem. Why are you bringing back all records from a month? This is a lot of data.
In all likelihood, you could summarize the data in the database and only bring back the information you need rather than all the data.
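As a rough illustration of summarizing in the database, the sketch below returns one row per day instead of every log record. It again uses sqlite3 as a stand-in for MySQL; the per-day count is only an assumed example of a summary, since the question does not say what the data is used for.

```python
import sqlite3


def build_logs():
    """In-memory LOGS table with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE LOGS (id INTEGER PRIMARY KEY, datetime TEXT)")
    conn.executemany(
        "INSERT INTO LOGS (datetime) VALUES (?)",
        [
            ("2024-01-05 09:00:00",),  # before the cutoff used in the usage example
            ("2024-02-10 09:00:00",),
            ("2024-02-10 17:30:00",),
            ("2024-02-11 12:00:00",),
        ],
    )
    return conn


def daily_counts(conn, cutoff):
    """Return (day, number_of_rows) pairs for rows newer than `cutoff`."""
    sql = """
        SELECT date(datetime) AS day, COUNT(*) AS n
        FROM LOGS
        WHERE datetime > ?
        GROUP BY day
        ORDER BY day
    """
    return conn.execute(sql, (cutoff,)).fetchall()
```

The client then receives a few dozen rows for a month rather than the full table slice.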

MySQL - group by interval query optimisation

Some background first. We have a MySQL database with a "live currency" table. We use an API to pull the latest currency values for different currencies, every 5 seconds. The table currently has over 8 million rows.
Structure of the table is as follows:
id (INT 11 PK)
currency (VARCHAR 8)
value (DECIMAL)
timestamp (TIMESTAMP)
Now we are trying to use this table to plot the data on a graph. We are going to have various different graphs, e.g: Live, Hourly, Daily, Weekly, Monthly.
I'm having a bit of trouble with the query. Using the Weekly graph as an example, I want to output data from the last 7 days, in 15 minute intervals. So here is how I have attempted it:
SELECT *
FROM currency_data
WHERE ((currency = 'GBP')) AND (timestamp > '2017-09-20 12:29:09')
GROUP BY UNIX_TIMESTAMP(timestamp) DIV (15 * 60)
ORDER BY id DESC
This outputs the data I want, but the query is extremely slow. I have a feeling the GROUP BY clause is the cause.
Also BTW I have switched off the sql mode 'ONLY_FULL_GROUP_BY' as it was forcing me to group by id as well, which was returning incorrect results.
Does anyone know of a better way of doing this query which will reduce the time taken to run the query?
You may want to create summary tables for each of the graphs you want to do.
If your data really is coming every 5 seconds, you can attempt something like:
SELECT *
FROM currency_data cd
WHERE currency = 'GBP' AND
timestamp > '2017-09-20 12:29:09' AND
UNIX_TIMESTAMP(timestamp) MOD (15 * 60) BETWEEN 0 AND 4
ORDER BY id DESC;
For both this query and your original query, you want an index on currency_data(currency, timestamp, id).
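The arithmetic behind the MOD filter can be checked with a few lines of Python. This is only a sketch under the assumption made above, that samples arrive exactly every 5 seconds; the start timestamp is arbitrary and the function names are made up.

```python
BUCKET = 15 * 60  # bucket width in seconds, as in DIV (15 * 60)


def bucket_of(ts):
    """Bucket number of a Unix timestamp, like UNIX_TIMESTAMP(ts) DIV (15 * 60)."""
    return ts // BUCKET


def is_bucket_sample(ts, period=5):
    """True for timestamps in the first `period` seconds of a bucket.

    With period=5 this matches MOD (15 * 60) BETWEEN 0 AND 4.
    """
    return 0 <= ts % BUCKET < period


# One hour of samples taken every 5 seconds, starting at an arbitrary time.
start = 1_500_000_003
samples = list(range(start, start + 3600, 5))
picked = [ts for ts in samples if is_bucket_sample(ts)]
```

Because 900 is a multiple of 5, exactly one sample per 15-minute bucket passes the filter, so no GROUP BY is needed. If the polling drifts or a sample is missed, a bucket can end up with zero or two rows, which is why the summary-table approach is the more robust option.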

Getting last 30 days of records

I have a table called 'Articles'. In that table I have 2 columns that will be essential in creating the query I want to create. The first column is the dateStamp column, which is a datetime type column. The second column is the Counter column, which is an int(255) column. The Counter column technically holds the views for that particular record.
I am trying to create a query that will generate the last 30 days of records. It will then order the records based on most viewed. This query will only pick up 10 records. The current query I have is this:
SELECT *
FROM Articles
WHERE DATEDIFF(day, dateStamp, getdate()) BETWEEN 0 and 30
LIMIT 10
) TOP10
ORDER BY Counter DESC
This query is not displaying any records, but I don't understand what I am doing wrong. Any suggestions?
The MySQL version of the query would look like this:
SELECT a.*
FROM Articles a
WHERE a.dateStamp >= CURDATE() - interval 30 day
ORDER BY a.counter DESC
LIMIT 10;
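Your query is generating an error, because DATEDIFF(day, dateStamp, getdate()) is SQL Server syntax: MySQL's DATEDIFF() takes only two arguments and MySQL has no getdate() function. You should look at that error before fixing the query.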
The query would look different in SQL Server.
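The filter-then-sort-then-limit shape of the MySQL query can be exercised with a small sqlite3 sketch. Here date(?, '-30 days') stands in for CURDATE() - interval 30 day, "today" is passed in so the result is reproducible, and the id column and sample rows are hypothetical additions for the demo.

```python
import sqlite3


def build_articles():
    """In-memory Articles table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Articles (id INTEGER PRIMARY KEY, dateStamp TEXT, Counter INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Articles (id, dateStamp, Counter) VALUES (?, ?, ?)",
        [
            (1, "2024-01-01 10:00:00", 999),  # most viewed, but older than 30 days
            (2, "2024-02-20 10:00:00", 50),
            (3, "2024-02-25 10:00:00", 90),
            (4, "2024-03-01 10:00:00", 10),
        ],
    )
    return conn


def top_articles(conn, today, limit=10):
    """Most viewed articles from the 30 days before `today`."""
    sql = """
        SELECT id, Counter
        FROM Articles
        WHERE dateStamp >= date(?, '-30 days')
        ORDER BY Counter DESC   -- sort first ...
        LIMIT ?                 -- ... then keep the top rows
    """
    return conn.execute(sql, (today, limit)).fetchall()
```

The ordering has to happen before the LIMIT; limiting first and sorting afterwards would sort an arbitrary set of 10 rows rather than the 10 most viewed.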

How to select the field's increment from mysql

I have a table recording the cumulative total visit numbers of some web pages every day. I want to fetch the real visit numbers on a specific day for all these pages. The table looks like this:
- record_id page_id date addup_number
- 1 1 2012-9-20 2110
- 2 2 2012-9-20 1160
- ... ... ... ...
- n 1 2012-9-21 2543
- n+1 2 2012-9-21 1784
The result I'd like to fetch looks like this:
- page_id date increment_num(the real visit numbers on this date)
- 1 2012-9-21 X
- 2 2012-9-21 X
- ... ... ...
- N 2012-9-21 X
But I don't want to do this in PHP, because it's time-consuming. Can I get what I want with SQL statements or with some MySQL functions?
Ok. You need to join the table on itself by joining on the date column and adding a day to one side of the join.
Assuming:
date column is a legitimate DATE Type and not a string
Every day is accounted for each page (no gaps)
addup_number is an INT of some type (BIGINT, INT, SMALLINT, etc...)
table_name is substituted for your actual table name which you don't indicate
Only one record per day for each page... i.e. no pages have multiple counts on the same day
You can do this:
SELECT t2.page_id, t2.date, t2.addup_number - t1.addup_number AS increment_num
FROM table_name t1
JOIN table_name t2 ON t1.date + INTERVAL 1 DAY = t2.date
WHERE t1.page_id = t2.page_id
One thing to note is if this is a huge table and date is an indexed column, you'll suffer on the join by having to transform it by adding a day in the ON clause, but you'll get your data.
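The self-join can be checked against the sample rows from the question with a short sqlite3 sketch. The table name visits is a placeholder, dates are written in ISO form (2012-09-20), and date(t1.date, '+1 day') is the SQLite counterpart of t1.date + INTERVAL 1 DAY.

```python
import sqlite3


def build_visits():
    """In-memory table holding the cumulative counts from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits ("
        "record_id INTEGER PRIMARY KEY, page_id INTEGER, date TEXT, addup_number INTEGER)"
    )
    conn.executemany(
        "INSERT INTO visits (page_id, date, addup_number) VALUES (?, ?, ?)",
        [
            (1, "2012-09-20", 2110),
            (2, "2012-09-20", 1160),
            (1, "2012-09-21", 2543),
            (2, "2012-09-21", 1784),
        ],
    )
    return conn


def daily_increments(conn):
    """Per-page difference between each day's total and the previous day's."""
    sql = """
        SELECT t2.page_id, t2.date, t2.addup_number - t1.addup_number AS increment_num
        FROM visits t1
        JOIN visits t2
          ON t2.page_id = t1.page_id
         AND t2.date = date(t1.date, '+1 day')   -- t1 is the day before t2
        ORDER BY t2.date, t2.page_id
    """
    return conn.execute(sql).fetchall()
```

The first recorded day of each page has no predecessor row, so it does not appear in the result; that follows from the inner join and the "no gaps" assumption listed above.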
UPDATED:
SELECT today.page_id, today.date, (today.addup_number - yesterday.addup_number) as increment
FROM myvisits_table today, myvisits_table yesterday
WHERE today.page_id = yesterday.page_id
AND today.date='2012-9-21'
AND yesterday.date='2012-9-20'
GROUP BY today.page_id, today.date, yesterday.page_id, yesterday.date
ORDER BY page_id
Something like this:
SELECT date, SUM(addup_number)
FROM your_table
GROUP BY date