Couchbase deleted documents reappearing in database

We are experiencing a problem where deleted documents reappear on our Couchbase server.
We have a scenario where documents are created on CBL (Couchbase Lite). These documents are synced up to the server. The user realizes an error has been made and flags the document as incorrect. On the server, the admin can then view all of the flagged documents and delete them. The sync gateway has been set up to only sync these documents up, i.e. once an edit has been made to them on the server, the changes are not synced back down to CBL.
Here is the process of what is happening:
1. A document is created on CBL with a TTL of 15 days and synced to the sync gateway.
2. The document is updated on CBL and synced to the sync gateway.
3. The document is deleted from the Couchbase Server bucket with a N1QL DELETE query.
4. Within a few days of being deleted from the bucket, the document randomly gets added back.
5. Only documents that are still on the devices, i.e. not older than the 15-day TTL, are added back to the bucket.
We tried increasing the Metadata Purge Interval to more than 15 days, but this did not resolve the problem.
Does anybody have any suggestions, or possibly know what the problem could be here?
Couchbase Server Community Edition 6.5.1 build 6299
Sync gateway 2.7.3
Couchbase Lite Android 2.8.1
Thanks in advance!
PS: Here is our Sync Gateway config with the sync function:
"log": [
"*"
],
"adminInterface": "0.0.0.0:4985",
"interface": "0.0.0.0:4984",
"databases": {
"prod": {
"server": "http://localhost:8091",
"bucket": "prod_bukcet",
"username": "sync_gateway",
"password": "XXX",
"enable_shared_bucket_access": true,
"import_docs": "continuous",
"use_views": true,
"users": {
"user_X": {
"password": "XXX",
"admin_channels": ["*"],
"disabled": false
}
},
"sync":`
function sync(doc, oldDoc) {
/* sanity check */
// check if document was removed from server or via SDK
// In this case, just return
if (isRemoved()) {
return;
}
//Only sync down documents that are created on the server
if (doc.deviceDoc == true) {
channel("server");
} else {
if (doc.siteId) {
channel(doc.siteId);
} else {
channel("devices");
}
}
// This is when document is removed via SDK or directly on server
function isRemoved() {
return (isDelete() && oldDoc == null);
}
function isDelete() {
return (doc._deleted == true);
}
}`,
}
}
}

In shared bucket access mode (enable_shared_bucket_access: true), a N1QL delete on a document creates a tombstone, and tombstones are always synced. The metadata purge interval setting on the server determines the period after which the tombstone gets purged on the server. It is therefore typical to set it to a value that matches the maximum offline window of your clients, to ensure that all disconnected clients have the opportunity to see the delete. So setting it to more than 15 days just means that the tombstone is kept around for at least that long, and tombstoned documents will be synced down to clients in the meantime.
In your case, if you don't want the documents to be synced down to clients because the lifetime of the document is managed independently on the CBL side via the document expiration (TTL), then purge the document on the server instead of deleting it.
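With shared bucket access enabled, one way to do that is the Sync Gateway admin REST API's _purge endpoint, which removes the document and its revision history without leaving a tombstone behind. A minimal sketch, assuming the database name from the config above and a made-up document ID:

curl -X POST http://localhost:4985/prod/_purge \
  -H "Content-Type: application/json" \
  -d '{"flagged-doc-id": ["*"]}'

Because no tombstone is created, nothing gets replicated to the clients; the copies already on the devices simply age out via their 15-day TTL.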

Related

Couchbase Document with abnormal metadata

We're running Couchbase Server with Sync Gateway in our project. When I query Couchbase, I encounter documents with the metadata below; the flags and expiration fields are equal to zero.
The problem is, this document was not in the DB yesterday. Is it normal to have a document with metadata like the below? The document body has a lastupdate field that points to a date last year. It's possible that we deleted it before, but it has reappeared now.
When I checked the Couchbase and Sync Gateway logs, I didn't encounter any entries belonging to its id.
{
  "flags": 0,
  "expiration": 0,
  "id": "SomeIDXXX::YYY",
  "cas": 1669165471522357248,
  "type": "json"
}

Growing number of couchbase binary documents in XDCR destination bucket

I am running Couchbase Enterprise Edition 6.6.2 on Windows Server 2016 Standard Edition.
I have two buckets called A and B. Bucket A is configured to run with enable_shared_bucket_access = true; my sync gateway creates new documents in bucket A, and a bunch of services change and delete these documents.
XDCR replicates documents from bucket A to bucket B. All changes to documents in bucket A are replicated to bucket B, except that deletions in bucket A are not replicated. When documents in bucket B get older than 62 days they get deleted by an external service.
Over time I noticed that 93% of the documents in bucket B are binary documents! My own documents are JSON; I don't use any kind of binary documents in my solution. This leads me to the conclusion that these binary documents are some internal Couchbase documents.
Here is an example of these binary documents:
{
  "$1": {
    "cas": 1667520921496387584,
    "expiration": 0,
    "flags": 50331648,
    "id": "_sync:rev:00001abd-1f99-4b4e-a695-d11574ea9ed8:0:",
    "type": "base64"
  },
  "pa": "<binary (1 b)>"
},
{
  "$1": {
    "cas": 1667484959445614592,
    "expiration": 0,
    "flags": 50331648,
    "id": "_sync:rev:00001abd-1f99-4b4e-a695-d11574ea9ed8:34:2-d3fb2d58672f853d98ce343d3ae84c1d",
    "type": "base64"
  },
  "pa": "<binary (1129 b)>"
}
My issue with these documents is that they increase dramatically over time and they don't get cleaned up automatically, so they just keep growing and consuming resources!
What are these documents used for?
Why aren’t these documents cleaned automatically?
Is it safe to simply delete these documents?
Is this a bug or a feature? :-)
Regards,
Siraf
The issue was solved by adding AND NOT REGEXP_CONTAINS(META().id, "^_sync:rev") to the XDCR replication filter expression. This stopped the binary documents from being replicated from bucket A to B.
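For reference, the complete filter expression might look like this sketch, where the type clause is hypothetical and stands in for whatever filter you already had:

type = "order" AND NOT REGEXP_CONTAINS(META().id, "^_sync:rev")

The _sync:rev documents are Sync Gateway's internal backups of older revision bodies, which is why they show up as binary and why it is safe to exclude them from the replication to bucket B.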

Google Drive REST API - How to check if file has changed

Is there a reliable way, short of comparing full contents, of checking if a file was updated/changed in Drive?
I have been struggling with this for a bit. Here are the two things I have tried:
1. File version number
I upload a plain text file to Google Drive (simple upload, update endpoint), and save the version from the file metadata returned after a successful upload.
Then I poll the Drive API (get endpoint) occasionally to check if the version has changed.
The trouble is that within a second or two of uploading the file, the version gets bumped up again.
There are no changes to the file content. The file has not been opened, viewed, or even downloaded anywhere else. Still, the version number increases from what it was after the upload.
To my code this version number change indicates that the remote file has been changed in Drive, so it downloads the new version. Every time!
2. The Changes endpoints
As an alternative, I tried using the Changes API.
After I upload the file, I get a page token using changes.getStartPageToken or changes.list.
Later I use this page token to poll the Changes API for changes and filter them for the fileId of the uploaded file. I use these options when polling for changes:
{
  "includeRemoved": false,
  "restrictToMyDrive": true,
  "spaces": "drive"
}
Here again, there is the same problem as with the version number. The page token returned immediately after uploading the file changes again within a second or two, and the new page token shows the uploaded file as having been changed.
Again, there is no change to the content of the file. It hasn't been opened, updated, or downloaded anywhere else. It isn't shared with anyone else.
Yet, a few seconds after uploading, the file reappears in the changes list.
As a result, the local code redownloads the file from Drive, assuming remote changes.
Possible workaround
As a hacky workaround, I could wait a few seconds after the file upload before getting the new file version / changes page token. This may take care of the delayed version increment issue.
However, there is no documentation of what causes this phantom change in the version number (or changes list). So I have no sure way of knowing:
How long a wait is safe enough to get a 'settled' version number without losing possible changes by other users/apps?
Whether the new (delayed) version number will be stable, or may change again at any time for no reason?
Is there a reliable way, short of comparing full contents, of checking if a file was updated/changed in Drive?
You can try using the md5Checksum property of the Files resource, if your file is not a Google Docs file (i.e. it is binary content). You should be able to use that to track changes to the contents of your binary files.
You might also be able to use the Revisions API; the Revisions resource also has an md5Checksum property.
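A minimal polling sketch, assuming a valid OAuth access token in $ACCESS_TOKEN and your file's ID in place of FILE_ID:

curl -H "Authorization: Bearer $ACCESS_TOKEN" \
  "https://www.googleapis.com/drive/v3/files/FILE_ID?fields=md5Checksum,modifiedTime"

Compare the returned md5Checksum with the value you stored after the upload, and only download when it actually differs.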
As a workaround, how about using the Drive Activity API? I think there are several possible answers for your situation, so please think of this as just one of them.
When the Drive Activity API is used, activity information about the target file can be retrieved. For example, from ActionDetail you can see whether the target file was edited, renamed, deleted and so on.
The sample endpoint and request body are as follows.
Endpoint:
POST https://driveactivity.googleapis.com/v2/activity:query?fields=activities%2CnextPageToken
Request body:
{"itemName": "items/### fileId of target file ###"}
Response:
A sample response is shown below. From it you can see that the file with the given fileId and filename was edited at the given timestamp.
{
  "activities": [
    {
      "primaryActionDetail": {
        "edit": {} <--- If the target file was edited, this property is added.
      },
      "actors": [
        {
          "user": {
            "knownUser": {
              "personName": "people/### userId who edited the target file ###",
              "isCurrentUser": true
            }
          }
        }
      ],
      "actions": [
        {
          "detail": {
            "edit": {}
          }
        }
      ],
      "targets": [
        {
          "driveItem": {
            "name": "items/### fileId of target file ###",
            "title": "### filename of target file ###",
            "file": {},
            "mimeType": "### mimeType of target file ###",
            "owner": {
              "user": {
                "knownUser": {
                  "personName": "people/### owner's userId ###",
                  "isCurrentUser": true
                }
              }
            }
          }
        }
      ],
      "timestamp": "2000-01-01T00:00:00.000Z"
    }
  ],
  "nextPageToken": "###"
}
Note:
To use this API in your environment, please enable the Drive Activity API in the API console and include https://www.googleapis.com/auth/drive.activity.readonly in the scopes.
When I used this API, the response was fast; if the response is slow when you use it, I apologize.
References:
Google Drive Activity API
ActionDetail
If this was not what you want, I apologize.
What you are seeing is the eventual consistency behaviour of the Google Drive filesystem. If you think about search, it doesn't matter how quickly a search index is updated, only that it is eventually updated and is very efficient for reading. Google Drive works on the same premise.
Drive acknowledges your updates as quickly as possible, long before those updates have propagated to all worldwide copies of your file. Derived data (e.g. timestamps and, I seem to recall, md5sums) are also calculated after the update has "completed".
The solution largely depends on how problematic the redundant syncs are to your app.
A delay of a few seconds is enough to deal with the vast majority of phantom updates.
You could switch to the v2 API and use etags.
You could implement your own version number using custom properties: every time you sync up, you increment your own version number, and you only sync down if your application version number has changed.
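A sketch of that approach using a custom property via the Drive v3 files.update endpoint; the appVersion property name is made up for illustration:

curl -X PATCH \
  "https://www.googleapis.com/drive/v3/files/FILE_ID?fields=properties" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"properties": {"appVersion": "42"}}'

On every sync-up you PATCH an incremented value, and when polling you compare the stored property instead of Drive's opaque version field.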

sensu client subscriptions non-responding

I have set up the sensu-server and client successfully, and all is working except one thing. In the screenshot you can see that there are alerts for the mysql and web ports, but I have only given the "mysql" subscription right now in the client.json file on my client system. I removed the "webserver" subscription from client.json (which I had added initially, before replacing it with "mysql"), but the checks associated with the "webserver" subscription are still displayed. Why is this, and how do I display only the checks associated with the given subscription? Here is my client.json:
{
  "client": {
    "name": "sensuclient2",
    "address": "127.0.0.1",
    "keepalive": {
      "thresholds": {
        "warning": 60,
        "critical": 120
      },
      "handlers": ["default", "mailer", "sns"]
    },
    "subscriptions": [
      "mysql"
    ]
  }
}
It's possible Uchiwa is showing older check results, from before the change you made to your client configuration file (at least I went through that once!). Try deleting the events; if the API is not running the checks anymore, the events won't come up again.
You can either use sensu-cli to delete the events:
sensu-cli event delete sensuclient2 check_http
https://github.com/agent462/sensu-cli
Or make an API call...
curl -s -i -X DELETE http://yourhost:yourport/events/sensuclient2/check_http
https://sensuapp.org/docs/1.1/api/events-api.html#eventsclientcheck-delete
If the checks do come back, you should check both the server-side and client-side check definitions and the client configuration.
Also, the simplest is the best, as #vishal.k himself reminded me:
you can always delete the events using Uchiwa's interface. :)

Consul 0.8 ACL migration - how to migrate

TL;DR
How do I migrate my 0.7.3 (pre-0.8) ACL permissions to the new Consul 0.8 ACLs?
Current setup
I am currently running an ACL-enabled Consul 0.7.3 stack.
With Consul 0.8, ACLs will finally also cover services and nodes, so that nodes/services are no longer shown to anonymous users. This is exactly what I need. Today I tried to enable the new pre-0.8 ACL enforcement using https://www.consul.io/docs/agent/options.html#acl_enforce_version_8
After doing so, my nodes could no longer authenticate against the master (if authentication is the problem at all).
I run the Consul network with gossip enabled. I have configured an acl_master_token:
{"acl_master_token": "<token>"}
and a token for the agents:
{"acl_token": "<token>"}
which all agents use / are configured with.
I have these ACL defaults:
{
  "acl_datacenter": "stable",
  "acl_default_policy": "deny",
  "acl_down_policy": "deny"
}
and my Consul config looks like this:
{
  "datacenter": "stable",
  "data_dir": "/consul/data",
  "ui": true,
  "dns_config": {
    "allow_stale": false
  },
  "log_level": "INFO",
  "node_name": "dwconsul",
  "client_addr": "0.0.0.0",
  "server": true,
  "bootstrap": true,
  "acl_enforce_version_8": true
}
What happens
When I boot, I cannot see my nodes/services using my token at all, nor can the nodes/agents register with the master.
Question
What exactly is needed to get the following:
All agents can see all nodes, all services, and all KVs.
Anonymous sees nothing: no KV, services, or nodes (that's what becomes possible with 0.8).
I looked at the "ACL Changes Coming in Consul 0.8" section of https://www.consul.io/docs/internals/acl.html but I could not wrap my head around it. Should I now use https://www.consul.io/docs/agent/options.html#acl_agent_master_token instead of acl_token?
Thank you for any help. I guess I will not be the only one on this migration path with this particular interest; a lot of people are interested in this, so you'd be helping all of them :)
It looks like the new node policy is preventing the nodes from registering properly. This should fix things:
On your Consul servers configure them with an acl_agent_token that has a policy that can write to any node, like this: node "" { policy = "write" }.
On your Consul agents, configure them similarly to the servers to keep things open, or give them a token with a more specific policy that only lets them write to some allowed node-name prefix.
Note this gets set as the acl_agent_token which is used for internal registration operations. The acl_agent_master_token is used as kind of an emergency token to use the /v1/agent APIs if there's something wrong with the Consul servers, but it only applies to the /v1/agent APIs.
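For example, a sketch of creating such a token with the legacy (pre-1.4) ACL API, authenticated with your acl_master_token; the Name is arbitrary:

curl -X PUT "http://127.0.0.1:8500/v1/acl/create?token=<acl_master_token>" \
  -d '{
    "Name": "agent-registration",
    "Type": "client",
    "Rules": "node \"\" { policy = \"write\" }"
  }'

The call returns an ID, and that ID is the value you then configure as acl_agent_token.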
For "all agents can see all nodes and all services and all KVs" you'd add node read privileges to whatever token you are giving to your agents via the acl_token, so you'd add a policy like:
node "" { policy = "read" }
service "" { policy = "read" }
key "" { policy = "read" }
Note that this allows anyone with access to the agent's client interface to read all these things, so you want to be careful with what you bind to (usually only loopback). Or don't set acl_token at all and make callers pass in a token with each request.
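Putting it together, the ACL-related part of an agent's config might look like this sketch, with the token values standing in for tokens created with the policies above:

{
  "acl_datacenter": "stable",
  "acl_agent_token": "<token with node write policy>",
  "acl_token": "<token with node/service/key read policy>"
}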