How to delete documents with no subdocuments in Couchbase?

I have a Couchbase document with id x.
x has no subdocuments: they were all deleted in some subdoc operation, so the document is now just {}.
I want to delete all such empty docs with no subdocs. Is this possible in Couchbase, using an N1QL query or otherwise? I tried googling for solutions, but found nothing relevant.
Thanks for taking the time to read the question to the end.

The following query requires a primary index.
DELETE FROM default AS d
WHERE d = {};
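If the bucket has no primary index yet, one can be created first (a minimal sketch; substitute your bucket name for default):
CREATE PRIMARY INDEX ON default;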
The following query uses ix10 as a covering index; the index contains only empty objects.
CREATE INDEX ix10 ON default(OBJECT_LENGTH(self))
WHERE OBJECT_LENGTH(self) = 0;
DELETE FROM default AS d
WHERE OBJECT_LENGTH(d) = 0;
You can verify with the following data; the SELECT should return only "k003".
INSERT INTO default VALUES ("k001",1), VALUES ("k002",{"a":1}), VALUES ("k003",{});
SELECT META(d).id
FROM default AS d
WHERE OBJECT_LENGTH(d) = 0;
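If you want to see which documents get removed, the DELETE itself can report them via a RETURNING clause (a sketch using the same covering predicate):
DELETE FROM default AS d
WHERE OBJECT_LENGTH(d) = 0
RETURNING META(d).id;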

Related

Postgresql test JSON and delete

I have a PostgreSQL database with a table that contains a column of JSON data along the lines of
{"kind":2,"msgid":102}
{"kind":99,"pid":"39s-8KeH306vhjzNta3Yrg,,","msgid":101}
...
Is it possible to write and execute a DELETE statement along the lines of
DELETE FROM table WHERE data.kind = '99' AND data.pid = '39s-8KeH306vhjzNta3Yrg,,'?
where data happens to be the name of that particular column. I tried the above and got the error
missing FROM-clause entry for table "data"
i.e. PGSQL is interpreting that as being the table data. Clearly, the required syntax is different. I'd be grateful to anyone who might be able to tell me what to do here.
Assuming you have:
t=# with c(j) as (values('{"kind":99,"pid":"39s-8KeH306vhjzNta3Yrg,,","msgid":101}'::json))
select * from c where j->>'kind' = '99' and j->>'pid' = '39s-8KeH306vhjzNta3Yrg,,';
j
----------------------------------------------------------
{"kind":99,"pid":"39s-8KeH306vhjzNta3Yrg,,","msgid":101}
(1 row)
then your statement will be:
delete from table where data->>'kind' = '99' and data->>'pid' = '39s-8KeH306vhjzNta3Yrg,,';
Check the JSON operators here: https://www.postgresql.org/docs/current/static/functions-json.html
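If the column is of type jsonb (or can be cast to it), the same deletion can also be written with the @> containment operator; a sketch, with my_table standing in for your table name:
DELETE FROM my_table
WHERE data::jsonb @> '{"kind": 99, "pid": "39s-8KeH306vhjzNta3Yrg,,"}';
Note that ->> returns text, which is why the original statement compares against the string '99' even though kind is a JSON number.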

MySQL: How do I use LOAD DATA INFILE and replace existing rows' fields only when the field in the file is not empty?

I have a file with some empty fields like this:
(first column being the primary key: a1, b1, b2)
a1,b,,d,e
b1,c,,,,e
b2,c,c,,
I already have a row present in the table like
a1,c,f,d,e
Now for this key a1, using the REPLACE option and LOAD DATA INFILE, I want the final output to be:
a1,b,f,d,e
Here c in the second column has been replaced by b,
but f has not been replaced by an empty string.
To make it clear: replace a field if an actual value is present in the file;
if an empty field is present, retain the old value.
Consider two tables, each having five columns:
in t1 table -columns are c1,c2,c3,c4,c5
in t2 table -columns are d1,d2,d3,d4,d5
so the query will become like this:
select c1 as e1,
ifnull(c2,d2) as e2,
ifnull(c3,d3) as e3,
ifnull(c4,d4) as e4,
ifnull(c5,d5) as e5
from t1
inner join t2 on c1 = d1;
Hope this helps.
Please try the following...
CREATE TABLE tempTblDataIn LIKE tblTable;
/* Read your data into tempTblDataIn here */
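-- A sketch of that load step (the file path and FIELDS clause are
-- illustrative). LOAD DATA reads empty fields as '' rather than NULL,
-- so convert them explicitly, otherwise the COALESCE() calls below
-- will never fall back to the old values:
LOAD DATA INFILE '/tmp/data.csv'
INTO TABLE tempTblDataIn
FIELDS TERMINATED BY ','
( fldID, @f1, @f2, @f3, @f4 )
SET fldField1 = NULLIF( @f1, '' ),
    fldField2 = NULLIF( @f2, '' ),
    fldField3 = NULLIF( @f3, '' ),
    fldField4 = NULLIF( @f4, '' );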
UPDATE tblTableName
JOIN tempTblDataIn ON tblTableName.fldID = tempTblDataIn.fldID
SET tblTableName.fldField1 = COALESCE( tempTblDataIn.fldField1, tblTableName.fldField1 ),
tblTableName.fldField2 = COALESCE( tempTblDataIn.fldField2, tblTableName.fldField2 ),
tblTableName.fldField3 = COALESCE( tempTblDataIn.fldField3, tblTableName.fldField3 ),
tblTableName.fldField4 = COALESCE( tempTblDataIn.fldField4, tblTableName.fldField4 );
DROP TABLE tempTblDataIn;
This Answer is based on Eric's Answer at MySQL - UPDATE query based on SELECT Query.
It is also based on the assumption that the data file will contain update data only rather than update data and new records.
Yes, you will need a COALESCE() line for each field, and you will probably have to code each line yourself. If there are many fields with a repeated structure, you could use a PROCEDURE to produce the above statements programmatically, but you may find the above simpler.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://dev.mysql.com/doc/refman/5.7/en/create-table-like.html (on MySQL's CREATE TABLE ... LIKE)
https://dev.mysql.com/doc/refman/5.7/en/update.html (on MySQL's UPDATE statement)

MySQL INSERT-SELECT a non-mandatory field with JOIN

I have a table (netStream) that has 2 foreign keys: (logSessions_logSessionID) and (accountSessions_accountSessionID).
The (logSessions_logSessionID) is mandatory; the (accountSessions_accountSessionID) is NOT mandatory.
Here is the part of the block diagram that shows the connections and the non-mandatory status:
(The background: logSessions are the sessions that every visitor has; accountSessions are the login sessions. Everybody has a logSession (since everybody is a visitor), but not everybody is logged in, so not everybody has an accountSession.)
I want to insert a row into (netStream); in every case there is a (logSession), but that is not so for (accountSession). So when there is an (accountSession), I want to insert that ID too; if there is no (accountSession), then just leave that field in (netStream) NULL.
The hash values are stored in BINARY(x), which is why I use UNHEX().
This is the MySQL I wrote; there is no error message, but it does not work. What is the problem?
INSERT INTO `test-db`.`netStream` (`netStreamHash`, `logSessions_logSessionID`, `accountSessions_accountSessionID`)
SELECT UNHEX("1faab"), `logSessions`.`logSessionID`, NULL FROM `logSessions` CROSS JOIN `accountSessions`
WHERE `logSessions`.`logSessionHash` = UNHEX("aac") AND
`accountSessions`.`accountSessionHash` = UNHEX("2fb");
If I understand you correctly, you are probably looking for something like this:
INSERT INTO `test-db`.`netStream` (
`netStreamHash`,
`logSessions_logSessionID`,
`accountSessions_accountSessionID`)
SELECT UNHEX("1faab"),
(SELECT `logSessionID` FROM `logSessions`
WHERE `logSessionHash` = UNHEX("aac")),
(SELECT `accountSessionID` FROM `accountSessions`
WHERE `accountSessionHash` = UNHEX("2fb"));
If there is no matching row in accountSessions, then you'll get NULL inserted into accountSessions_accountSessionID in the netStream table.
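An equivalent formulation uses a LEFT JOIN, which also shows why the original query inserts nothing: a CROSS JOIN produces no rows at all when accountSessions has no matching row, whereas a LEFT JOIN keeps the logSessions row and supplies NULL. A sketch:
INSERT INTO `test-db`.`netStream` (
`netStreamHash`,
`logSessions_logSessionID`,
`accountSessions_accountSessionID`)
SELECT UNHEX("1faab"),
`ls`.`logSessionID`,
`acs`.`accountSessionID`
FROM `logSessions` AS `ls`
LEFT JOIN `accountSessions` AS `acs`
ON `acs`.`accountSessionHash` = UNHEX("2fb")
WHERE `ls`.`logSessionHash` = UNHEX("aac");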

Sphinx main/delta indexing, sql_query_killlist

I am currently using Sphinx for indexing a MySQL query with 20+ million records.
I am using a delta index to update the main index and add all new records.
Unfortunately, a lot of the changes to the tables are deletions.
I understand that I can use sql_query_killlist to get all document IDs that need to be deleted or updated. Unfortunately, I don't understand how this actually works, and the documentation from Sphinx does not have a good enough example for me to understand.
If I use the following example, how could I implement the killlist?
in MySQL
CREATE TABLE sph_counter
(
counter_id INTEGER PRIMARY KEY NOT NULL,
max_doc_id INTEGER NOT NULL
);
in sphinx.conf
source main
{
# ...
sql_query_pre = SET NAMES utf8
sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
sql_query = SELECT id, title, body FROM documents \
WHERE id<=( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}
source delta : main
{
sql_query_pre = SET NAMES utf8
sql_query = SELECT id, title, body FROM documents \
WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}
index main
{
source = main
path = /path/to/main
# ... all the other settings
}
# note how all other settings are copied from main,
# but source and path are overridden (they MUST be)
index delta : main
{
source = delta
path = /path/to/delta
}
The specifics depend a lot on how you mark deleted documents, but you would just add something like
sql_query_killlist = SELECT id FROM documents
WHERE status='deleted'
AND id<=( SELECT max_doc_id FROM sph_counter
WHERE counter_id=1 )
to the delta index. That would capture the ids of deleted records that are in the main index and add them to the kill-list, so they would never appear in search results.
If you want to capture updated records, you need to arrange for the new rows to be included in the main sql_query of the delta, AND for their ids to be in the kill-list.
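Putting that together, the delta source from the example above would look something like this (a sketch; it assumes a status column that marks deleted rows, as in the snippet above):
source delta : main
{
sql_query_pre = SET NAMES utf8
sql_query = SELECT id, title, body FROM documents \
WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
sql_query_killlist = SELECT id FROM documents \
WHERE status='deleted' \
AND id<=( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}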

How to avoid a filesort for this MySQL query?

I'm using this kind of query with different parameters:
EXPLAIN SELECT SQL_NO_CACHE `ilan_genel`.`id` , `ilan_genel`.`durum` , `ilan_genel`.`kategori` , `ilan_genel`.`tip` , `ilan_genel`.`ozellik` , `ilan_genel`.`m2` , `ilan_genel`.`fiyat` , `ilan_genel`.`baslik` , `ilan_genel`.`ilce` , `ilan_genel`.`parabirimi` , `ilan_genel`.`tarih` , `kgsim_mahalleler`.`isim` AS mahalle, `kgsim_ilceler`.`isim` AS ilce, (
SELECT `ilanresimler`.`resimlink`
FROM `ilanresimler`
WHERE `ilanresimler`.`ilanid` = `ilan_genel`.`id`
LIMIT 1
) AS resim
FROM (
`ilan_genel`
)
LEFT JOIN `kgsim_ilceler` ON `kgsim_ilceler`.`id` = `ilan_genel`.`ilce`
LEFT JOIN `kgsim_mahalleler` ON `kgsim_mahalleler`.`id` = `ilan_genel`.`mahalle`
WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
ORDER BY `ilan_genel`.`id` DESC
LIMIT 225 , 15
and this is what I get in the EXPLAIN section:
These are the indexes that I already tried to use:
Any help will be deeply appreciated: what kind of index would be the best option, or should I use another table structure?
You should first simplify your query to understand your problem better. As it appears your problem is constrained to the ilan_genel table, the following query should show the same symptoms:
SELECT * FROM `ilan_genel` WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
So the first thing to do is check that this is the case. If so, the simpler question is why this query requires a filesort on 3661 rows. Now the 'hepsi' index sort order is:
ilce -> mahalle -> durum -> kategori -> tip -> ozellik
I've written it that way to emphasise that it is sorted first on 'ilce', then 'mahalle', then 'durum', etc. Note that your query does not specify a 'mahalle' value, so the best the index can do is a lookup on 'ilce'. Now I don't know the heuristics of your data, but the next logical step in debugging this would be:
SELECT * FROM `ilan_genel` WHERE `ilan_genel`.`ilce` = '703'
Does this return 3661 rows?
If so, you should be able to see what is happening. The database is using the hepsi index, to the best of its ability, getting 3661 rows back and then sorting those rows to eliminate values according to the other criteria (i.e. 'durum', 'kategori', 'tip').
The key point here is that if data is sorted by A, B, C in that order and B is not specified, then the best that can be done logically is a lookup on A followed by a filter of the remaining rows against C. In this case, that filter is performed via a filesort.
Possible solutions
Supply 'mahalle' (B) in your query.
Add a new index on 'ilan_genel' that doesn't require 'mahalle', i.e. A -> C -> D..., as sketched below.
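For example, a sketch (the index name is illustrative): the equality columns come first, with the ORDER BY column last, so rows can be read off the index already sorted and the filesort disappears.
CREATE INDEX ix_ilce_durum_kategori_tip ON ilan_genel (ilce, durum, kategori, tip, id);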
Another tip
In case I have misdiagnosed your problem (easy to do when I don't have your system to test against), the important thing here is the approach to solving the problem: in particular, how to break a complicated query into a simpler one that produces the same behaviour, until you get to a very simple SELECT statement that demonstrates the problem. At that point, the answer is usually much clearer.