I have a bucket of 50k records in production, and I am supposed to add a new attribute to all of the documents. For that, I am executing the queries below through the web console's Query Workbench.
select count(*) from `my-bucket` where orderType is missing;  -- 50k records
update `my-bucket` set orderType = "MY_ORDER" where orderType is missing;  -- mutation count = 49950
Issue 1: Couchbase is not selecting all my documents for mutations.
Issue 2: After the update, when I again look for the number of documents for which the new attribute is missing, the count keeps increasing.
select count(*) from `my-bucket` where orderType is missing;  -- 100 records
select count(*) from `my-bucket` where orderType is missing;  -- 200 records
select count(*) from `my-bucket` where orderType is missing;  -- 350 records
Can someone please explain the reasons as well as the solution to this problem? We are running these queries in a live production environment.
Couchbase server version: Community edition 5.1
Try select count(meta().id) from `my-bucket` where orderType is missing. Apparently count(*) doesn't work correctly with a WHERE clause in Couchbase.
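If the shifting counts come from the index lagging behind the mutations (GSI indexes are updated asynchronously, and the default scan consistency is unbounded), you can also ask the query service for index-consistent results with request_plus scan consistency. A minimal sketch against the N1QL REST endpoint (host, port, and credentials are placeholders):

curl -u Administrator:password http://localhost:8093/query/service \
  --data-urlencode 'statement=SELECT COUNT(META().id) FROM `my-bucket` WHERE orderType IS MISSING' \
  --data-urlencode 'scan_consistency=request_plus'

With request_plus the query waits until the index has caught up with all mutations made before the request, so the count should stop drifting between runs.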
I have a table that logs downloads by IP, version, and platform. Looking at the table manually, I see a lot of duplicates where all three of those values are the same (the user is probably just impatient). I'd like to use a SELECT statement that filters out the duplicates and only returns one of the entries if all three of those values are the same. Even more advanced, if possible: I also have a date/time field that uses CURRENT_TIMESTAMP. It would be nice if I could include duplicates when they are from different days, but not different times, so I can see whether the same user is downloading again on a different day.
I'm mainly just trying to get statistics on how many unique people download each version each day. The structure of the DB table is simple...
key (AUTO_INCREMENT), date (CURRENT_TIMESTAMP), ip, user_agent, platform, version
The software has a Windows and Mac version (platform) and I offer both the current version and a few distinct past versions that were before major changes.
Just group by the fields that you don't want duplicated, like:
SELECT ip, platform, version, COUNT(*) AS number_of_tries, max(download_date) AS last_download_date
FROM downloads
GROUP BY ip, platform, version, DATE(download_date)
It would then be relatively easy to do some more advanced filtering over the result grouping by day, etc.
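Since the stated goal is how many unique people download each version each day, a COUNT(DISTINCT ...) over the same kind of grouping gets there directly (a sketch; it reuses the downloads table and download_date column named above):

SELECT DATE(download_date) AS download_day,
       platform,
       version,
       COUNT(DISTINCT ip) AS unique_downloaders
FROM downloads
GROUP BY DATE(download_date), platform, version
ORDER BY download_day, platform, version;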
With MySQL 8.0+ you can use ROW_NUMBER():
select *
from (select *,
             row_number() over (partition by ip, platform, version, date(datetime) order by datetime) as rn
      from table_name
     ) a
where a.rn = 1
Is this what you want? It returns the first record on each date for the ip/platform/version combination:
select t.*
from <tablename> t
where t.datetime = (select min(t2.datetime)
from <tablename> t2
where t2.ip = t.ip and
t2.platform = t.platform and
t2.version = t.version and
date(t2.datetime) = date(t.datetime)
);
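Whichever of these variants is used, a composite index covering the filter columns keeps them from scanning the whole table on every run (a sketch; the index name is made up and it assumes the downloads/download_date naming from the first answer):

ALTER TABLE downloads
    ADD INDEX idx_downloads_dedup (ip, platform, version, download_date);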
I'm using Knowage for data analysis and I'm facing performance issues. I'm now watching the 'dataset audit' log to see which queries the system performs, and I found this one that, to me, is nonsense:
SELECT COUNT(*)
FROM
(select TOP(100) PERCENT "ATC_1" AS "ATC_1"
from
(SELECT [ID_AFo]
,[ATC]
,[ATC_1]
,[ATC_3]
,[ATC_4]
,[ATC_5]
FROM [AFO]
) T order by "ATC_1" ASC
) u
The inner T query is the dataset definition query I entered, which is basically a SELECT * FROM [AFO] on my table; the outer wrappers were added by Knowage (I never wrote them).
Wouldn't a SELECT COUNT(*) FROM T have performed the same calculation while avoiding an expensive ORDER BY?
EDIT:
The backend (data source) is MSSQL; the cache server is MySQL, so frequent queries run on MySQL.
This query is equivalent to:
SELECT COUNT(*)
FROM [AFO];
The only reason that I can think of for constructing such a query is if the "100" could be set to another value. I'm not sure if SQL Server's optimizer is good enough to eliminate the ORDER BY in the subquery.
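If you want to see whether the sort actually survives, comparing the estimated plans of the wrapped count and the plain count is a quick check (a T-SQL sketch for SSMS or sqlcmd; with SHOWPLAN on, the statements are compiled but not executed):

SET SHOWPLAN_TEXT ON;
GO

-- Knowage-style wrapped count
SELECT COUNT(*)
FROM (SELECT TOP (100) PERCENT "ATC_1" AS "ATC_1"
      FROM [AFO]
      ORDER BY "ATC_1" ASC) u;

-- Plain count
SELECT COUNT(*)
FROM [AFO];
GO

SET SHOWPLAN_TEXT OFF;
GO

If both plans are identical, the ORDER BY is being eliminated and the wrapper is only cosmetic overhead.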
I fail at MySQL and could really do with some help. I don't know what this would be called, and all my attempts at using combinations of DISTINCT and GROUP BY are just not working out.
I have a table of server monitoring data with these columns:
nStatusNumber
Bandwidth
Load
Users
ServerNumber
DiskFree
MemFree
TimeStamp
nStatusNumber - a unique number, increasing for each entry
ServerNumber - a unique number for each server
For the top of my dashboard for this, I need to display the most recent report for each unique server.
// How many servers are we monitoring ?
$nNumServers = mysql_numrows(mysql_query("SELECT DISTINCT(ServerNumber) FROM server_status;"));
// Get our list of servers
$strQuery = "SELECT * FROM server_status ORDER BY nStatusNumber DESC limit ".$nNumServers.";";
And then loop through the results until we hit $nNumServers. This worked at first, until servers started going down/up and the report order got jumbled.
Say there are 20 servers: the most recent 20 results aren't necessarily one from each server.
I'm trying to figure this out in a hurry, and failing at it. I've tried all sorts of combinations of DISTINCT and GROUP BY with no luck so far and would appreciate any guidance on what's probably an embarrassingly easy problem that I just can't see the answer to.
Thanks!
PS - Here's an example query that I've been trying, showing the problem I'm having. Check the "nStatusNumber" field, these should be showing the most recent results only for each server - http://pastebin.com/raw.php?i=ngXLRhd6
PPS - Setting max(nStatusNumber) doesn't give accurate results. I don't want some average/sum/median figure; I need the most recent ACTUAL figures reported by each server. Here's more example results for the queries:
http://pastebin.com/raw.php?i=eyuPD7vj
For your purpose you need to find the row unique to an nServerNumber and its latest TimeStamp. This is not as simple as just taking MAX(TimeStamp), as you need to find the row corresponding to it.
Although I am not an expert in SQL you can try this and see if it works.
SELECT A.nServerNumber, A.nStatusNumber, A.nVNStatsBandwidth, A.fLoad, A.nUsers,
       A.nUsersPaid, A.nFreeDisk, A.nTotalDisk, A.nFreeMemory,
       A.nTotalMemory, A.TimeStamp
FROM server_status A
INNER JOIN
(
    SELECT nServerNumber, MAX(TimeStamp) AS `TimeStamp`
    FROM server_status
    GROUP BY nServerNumber
) B
ON  A.nServerNumber = B.nServerNumber
AND A.TimeStamp = B.TimeStamp
ORDER BY A.nServerNumber ASC;
This query will give you all the servers with their latest info. So if you want the total number of servers just run the mysql_numrows(...) function on this result and if you want the data just iterate through the same result (no need to fire two separate SQL queries).
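If server_status is large, the derived table in this answer benefits from a composite index over the grouping and join columns (a sketch; the index name is made up and the column names follow the answer above):

ALTER TABLE server_status
    ADD INDEX idx_server_latest (nServerNumber, `TimeStamp`);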
Try this:
SELECT
    MAX(nStatusNumber) AS nStatusNumber,
    Bandwidth,
    `Load`,
    Users,
    ServerNumber,
    DiskFree,
    MemFree,
    MAX(`TimeStamp`) AS `TimeStamp`
FROM server_status
GROUP BY ServerNumber
I think I'm having an issue with my MySQL server or the query I'm using; I'm not sure which.
Server is a VM with Ubuntu 12.04, 4 cores / 16 GB RAM
MySQL 5.5.24 x86
My query:
INSERT INTO `NEWTEXT`.`Order_LineDetails`
( OrderLineItem_ID, Customer_ID, Order_ID, ProductName )
SELECT
    `Order_Details`.`OrderDetailID`,
    `Orders`.`CustomerID`,
    `Order_Details`.`OrderID`,
    `prods`.`ProductName`
FROM Order_Details
JOIN Orders ON Orders.OrderID = Order_Details.OrderID
JOIN Products prods ON prods.ProductID = Order_Details.ProductID
WHERE Orders.OrderID = 500000
I'm not really sure where to start looking for the problem. The above query takes 9+ seconds to complete. The Order_Details table contains 1,800,000+ records in it.
The thing that is bugging me is that when I run a plain SELECT query it is also slow. BUT I have another server that's running Win2k MS SQL, and it's almost instant with the same SELECT query.
I'm hoping someone could point me in the right direction here.
EDIT
Well, sorry for the troubles and thanks for your help.
I found that the problem was that after I finished the import, I skipped the step where I would normally assign the new tables a primary key. I know, :( dumb.
Anyway! Don't forget to assign your Primary Keys!
Back to the start:
http://dev.mysql.com/doc/refman/5.5/en/optimizing-primary-keys.html
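For anyone hitting the same problem, the fix amounts to declaring the keys after the import, roughly like this (a sketch; it assumes OrderDetailID, OrderID, and ProductID are the single-column keys implied by the query above):

ALTER TABLE Order_Details ADD PRIMARY KEY (OrderDetailID);
ALTER TABLE Orders        ADD PRIMARY KEY (OrderID);
ALTER TABLE Products      ADD PRIMARY KEY (ProductID);

-- The join columns on the many side also want indexes
ALTER TABLE Order_Details
    ADD INDEX idx_od_order (OrderID),
    ADD INDEX idx_od_product (ProductID);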
I'm trying to execute the following query
SELECT * FROM person
WHERE id IN
( SELECT user_id FROM participation
WHERE activity_id = '1' AND application_id = '1'
)
The outer query returns about 4000 rows whilst the inner returns 29. When executed on my web server nothing happened, and when I tested it locally MySQL ended up using 100% CPU and still achieved nothing. Could the size be the cause?
Specifically, it causes the server to hang forever; I'm fairly sure the web server I ran the query on is in the process of crashing because of it (whoops).
Why don't you use an inner join for this query? I think that would be faster (and easier to read), and maybe it solves your problem (but I can't find a fault in your query).
EDIT: the inner-join solution would look like this:
SELECT person.*
FROM person
INNER JOIN participation
    ON person.id = participation.user_id
WHERE participation.activity_id = '1'
  AND participation.application_id = '1'
How many rows are there in the participation table, and what indexes are there?
A multi-column index on (user_id, activity_id, application_id) could help here.
Re comments: IN isn't slow. Subqueries within IN can be slow if they're correlated to the outer query.
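As a concrete sketch of that suggestion (the index name is made up; the EXISTS form is just an alternative to the join that avoids duplicated person rows when participation holds several matching rows per user):

CREATE INDEX idx_participation_user_act_app
    ON participation (user_id, activity_id, application_id);

SELECT p.*
FROM person p
WHERE EXISTS (
    SELECT 1
    FROM participation pa
    WHERE pa.user_id = p.id
      AND pa.activity_id = '1'
      AND pa.application_id = '1'
);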