Update collection, convert fields to array - JSON

I'm pretty new to the NoSQL world and I'd like to try the "geoNear" (geospatial) feature in MongoDB. I imported some data in this form:
{
    "_id" : ObjectId("549164b752c5c30b15bbc26a"),
    "ville" : "Auenheim",
    "lat" : "48,81",
    "lon" : "8,01"
}
and I need to update my whole collection to this form:
{
    "_id" : ObjectId("549164b752c5c30b15bbc26a"),
    "ville" : "Auenheim",
    "loc" : { "type" : "Point", "coordinates" : [ 8.01, 48.81 ] }
}
Is there a way to do that with a Mongo update query, or should I use a PHP script (the collection is huge)?
Thanks for the help,
happy

You can iterate through each document and change the format with a simple script. In the mongo shell, you would write something like
db.test.find({}, { "lat" : 1, "lon" : 1 }).forEach(function(doc) {
    // GeoJSON coordinates must be numeric, in [longitude, latitude] order,
    // so convert the comma-decimal strings before storing them.
    var lon = parseFloat(doc.lon.replace(",", "."));
    var lat = parseFloat(doc.lat.replace(",", "."));
    db.test.update({ "_id" : doc._id },
    {
        "$unset" : { "lat" : 1, "lon" : 1 },
        "$set" : { "loc" : { "type" : "Point", "coordinates" : [ lon, lat ] } }
    })
})
Your lat and lon values also need to be numbers; I'm not sure if the comma decimal separators were a typo, but the script above converts them as part of the update. To make this faster, you can use a parallel collection scan, which is supported in most drivers, to process all the documents using multiple threads.
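Once the data is migrated, the loc field still needs a 2dsphere index before geospatial queries such as $near will work. A minimal sketch, reusing the collection name from above (the query point and distance are placeholders):
db.test.createIndex({ "loc" : "2dsphere" })
// Example: find documents within 5 km of a point ([longitude, latitude] order):
db.test.find({
    "loc" : {
        "$near" : {
            "$geometry" : { "type" : "Point", "coordinates" : [ 8.01, 48.81 ] },
            "$maxDistance" : 5000
        }
    }
})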

Related

How to use .indexOn for dynamic keys in Firebase?

I have a DB in Firebase with this structure:
{
    "chats" : {
        "-L-hPbTK51XFwjNPjz3X" : {
            "lastMessage" : "Hello!",
            "timestamp" : 1512590440336,
            "title" : "chat 1",
            "users" : {
                "Ol0XhKBksFcrYmF4MzS3vbODvT83" : true
            }
        }
    },
    "messages" : {
        "-L-hPbTK51XFwjNPjz3X" : {
            "-L-szWDIKX2SQl4YZFw9" : {
                "message" : "Hello!",
                "timestamp" : 1512784663447,
                "userId" : "Ol0XhKBksFcrYmF4MzS3vbODvT83"
            }
        }
    },
    "users" : {
        "Ol0XhKBksFcrYmF4MzS3vbODvT83" : {
            "chats" : {
                "-L-hPbTK51XFwjNPjz3X" : true
            },
            "email" : "mm#gmail.com",
            "name" : "mm"
        }
    }
}
My code:
Database.database().reference().child("chats")
    .queryOrdered(byChild: "users/\(userId)").queryEqual(toValue: true).observe(.value, with: { snapshot in ... })
When I try to get chat members or user chats, it shows these warnings:
Using an unspecified index. Your data will be downloaded and filtered on the client. Consider adding ".indexOn": "chats/-L-hPbTK51XFwjNPjz3X" at /users to your security rules for better performance.
Using an unspecified index. Your data will be downloaded and filtered on the client. Consider adding ".indexOn": "users/Ol0XhKBksFcrYmF4MzS3vbODvT83" at /chats to your security rules for better performance.
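In rules form, the warnings appear to be asking for something like the following (a literal transcription of the suggested keys, which clearly cannot scale to dynamically generated chat and user IDs):
{
    "rules": {
        "users": {
            ".indexOn": "chats/-L-hPbTK51XFwjNPjz3X"
        },
        "chats": {
            ".indexOn": "users/Ol0XhKBksFcrYmF4MzS3vbODvT83"
        }
    }
}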
I found lots of solutions, but none of them works for me. I want to define .indexOn rules in my DB. Can you help me?

Node.js per-client mongoose schema

So I want to offer my users the ability to upload a CSV and from that generate a mongoose schema, which I store in the DB against that user. When the user logs in, they can create a collection according to their personal schema. Using generate-schema I am able to create a JSON schema which looks like:
{
    "_id" : ObjectId("596a872cd1e59c6135fa7b2e"),
    "title" : "Product Set",
    "type" : "array",
    "items" : {
        "type" : "object",
        "properties" : {
            "booktitle" : { "type" : "string" },
            "bookid" : { "type" : "string" },
            "bookauthor" : { "type" : "string" }
        },
        "required" : [
            "booktitle",
            "bookid",
            "bookauthor"
        ],
        "title" : "Product"
    },
    "$schema" : "http://json-schema.org/draft-04/schema#"
}
and store that in my schema collection. All good...
When I want to create a collection according to that schema and store data in it using mongoose, I retrieve the schema from the database (which works) and then do
var generatedSchema = GenerateSchema.mongoose(response)
I then create a model from that with:
var Model = db.models.Product || db.model('Product', generatedSchema);
and create an item from that model
var item = new Model({
    "_id": new ObjectID(),
    booktitle: 'The Godfather',
    bookid: 'abc123',
    bookauthor: 'Mario Puzo'
});
and save it:
item.save(function(err, response) { ... })
I don't get any errors but when I save it, in the collection I just see:
{
"_id" : ObjectId("5970b1a584d396d7a2241eba"),
"items" : {
"required" : []
},
"__v" : 0
}
Can anyone point me in the right direction as to why this isn't working? My suspicion is I am using the wrong type of schema to create the model.
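To illustrate that suspicion (a hypothetical sketch, not a confirmed diagnosis): generate-schema derives a schema from a sample object, so passing it the stored JSON Schema document would make it treat JSON Schema keywords such as items and required as data fields, which matches the saved document above.
// Hypothetical mix-up: the stored JSON Schema document is treated as sample data.
var generatedSchema = GenerateSchema.mongoose(storedJsonSchemaDoc); // storedJsonSchemaDoc is a placeholder name
// What the helper presumably expects instead: a record shaped like the real data.
var sample = { booktitle: 'The Godfather', bookid: 'abc123', bookauthor: 'Mario Puzo' };
var schemaFromSample = GenerateSchema.mongoose(sample);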
If someone has an answer to the above: how would you then go about creating methods on the schema, as you would if the schema were static and part of the application's codebase?
Thanks

Delete / add nested objects in Elasticsearch

I cannot find examples in the Elastic manual showing how to modify fields and nested objects of documents using RESTful commands in Kibana Sense. I am looking for something similar to Solr's atomic updates, which allow updating specific fields of documents.
What do RESTful commands in Kibana Sense that accomplish this look like? The only related info I can find in the manual is on Partial Updates to Documents, but I do not know how that can be applied to this use case.
For example, straight from the Elastic docs:
PUT my_index
{
    "mappings": {
        "my_type": {
            "properties": {
                "user": {
                    "type": "nested"
                }
            }
        }
    }
}
PUT my_index/my_type/1
{
    "group" : "fans",
    "user" : [
        {
            "first" : "John",
            "last" : "Smith"
        },
        {
            "first" : "Alice",
            "last" : "White"
        }
    ]
}
How can I delete an entry in the nested object, so that the document "1" looks like:
{
    "group" : "fans",
    "user" : [
        {
            "first" : "John",
            "last" : "Smith"
        }
    ]
}
How can I add an entry in the nested object, so that the document "1" looks like:
{
    "group" : "fans",
    "user" : [
        {
            "first" : "John",
            "last" : "Smith"
        },
        {
            "first" : "Alice",
            "last" : "White"
        },
        {
            "first" : "Peter",
            "last" : "Parker"
        }
    ]
}
You will have to use scripted updates, unless you want to fetch all nested objects, add or remove items, and re-index them all, which is what the previous answer proposed. However, if you have a lot of nested documents, you should be doing partial updates / additions and deletes; it is much quicker from a data-transfer and indexing point of view.
Here is a good article on how to do scripted updates in general:
https://iridakos.com/programming/2019/05/02/add-update-delete-elasticsearch-nested-objects
Unless I misunderstand your ask, you just post the updated version of the document to the same document ID each time.
To delete a nested document (or any field):
PUT my_index/my_type/1
{
    "group" : "fans",
    "user" : [
        {
            "first" : "Alice",
            "last" : "White"
        }
    ]
}
To add a user, add it to the list:
PUT my_index/my_type/1
{
    "group" : "fans",
    "user" : [
        {
            "first" : "Alice",
            "last" : "White"
        },
        {
            "first" : "Peter",
            "last" : "Parker"
        }
    ]
}
Note: Documents in Elasticsearch are immutable. Making a change to a single field causes the entire document to be re-indexed. Nested documents are always re-indexed with the parent document, so if you change a field in the parent, the nested document is also re-indexed. This can be a performance issue if the nested documents are large and the parents have frequent changes.
For this specific use case, you must use a scripted update. In JavaScript the call will look something like:
const documentUpdateInstructions = {
    index: "index-name",
    id: "document-id",
    body: {
        script: {
            lang: "painless",
            source: `ctx._source.myNestedObject.removeIf(object -> object.username == params.username);`,
            params: {
                username: "my_username"
            },
        },
    },
};
await client.update(documentUpdateInstructions);
This takes a document in the form of
document._source = {
    ...
    "myNestedObject": [
        {
            "username": "my_username",
            ...
        },
        {
            "username": "not_my_username",
            ...
        }
    ]
}
and deletes the object inside myNestedObject whose username matches the username provided (in this case my_username). The resulting document will be:
document._source = {
    ...
    "myNestedObject": [
        {
            "username": "not_my_username",
            ...
        }
    ]
}
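The question also asked how to add an entry; the same scripted-update approach can append to the nested array. A minimal sketch, assuming the same client and document layout as above (the new object's values are placeholders):
const documentAddInstructions = {
    index: "index-name",
    id: "document-id",
    body: {
        script: {
            lang: "painless",
            source: `ctx._source.myNestedObject.add(params.newObject);`,
            params: {
                newObject: { username: "new_username" }
            },
        },
    },
};
await client.update(documentAddInstructions);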

How to create an index with integer fields in Elasticsearch for a JSON file of this format?

I am trying to create an index in Elasticsearch for the JSON file of format:
{ "index" : { "_index" : "entity", "_type" : "type1", "_id" : "0" } }
{ "eid":"guid of Event autogenerated", "entityInfo": { "entityType":"qualityevent", "defaultLocale":"en-US" }, "systemInfo": { "tenantId":"67" }, "attributesInfo" : { "jobId":"21", "matchStatus": "new" } }
{ "index" : { "_index" : "entity", "_type" : "type1", "_id" : "1" } }
{ "eid":"guid of Event autogenerated", "entityInfo": { "entityType":"qualityevent", "defaultLocale":"en-US" }, "systemInfo": { "tenantId":"67" }, "attributesInfo" : { "jobId":"20", "matchStatus": "existing" } }
I want the fields jobId and tenantId to be integers.
I am giving the following mapping in curl command:
curl -XPUT http://localhost:9200/entity -d '
{
    "mappings": {
        "entityInfo": {
            "properties" : {
                "entityType" : { "type" : "string", "index" : "not_analyzed" },
                "defaultLocale" : { "type" : "string", "index" : "not_analyzed" }
            }
        },
        "systemInfo": {
            "properties" : {
                "tenantId" : { "type" : "integer" }
            }
        },
        "attributesInfo": {
            "properties" : {
                "jobId" : { "type" : "integer" },
                "matchStatus" : { "type" : "string", "index" : "not_analyzed" }
            }
        }
    }
}'
This does not give me an error. However, it creates new, empty integer fields jobId and tenantId, and keeps the existing data in attributesInfo.jobId as a string. The same is the case with systemInfo.tenantId. I want to use these two fields in Kibana for visualization, but currently I cannot, as they are empty.
I am new to Kibana and Elasticsearch, so I am not sure if the mapping is correct. I have tried a couple of other mappings as well, but they give errors; the above mapping does not.
This is what the Discover tab in Kibana looks like (screenshot omitted).
Please let me know where I am going wrong.
I tried as you mentioned but it didn't help. What I realised after a lot of trial and error was that my mapping was incorrect. I finally wrote the correct mapping and now it works: jobId and tenantId are recognised as numbers by Kibana. I am new to JSON, Kibana, Bulk, and Elastic, so it took time to understand how mapping works.
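The corrected mapping is not shown above, but judging from the bulk data (where every document has _type type1 and the three objects are fields of it), the likely fix is to declare entityInfo, systemInfo and attributesInfo as properties of type1 rather than as separate mapping types. A sketch under that assumption (note that documents indexed before the mapping change must be re-indexed for the new field types to apply):
curl -XPUT http://localhost:9200/entity -d '
{
    "mappings": {
        "type1": {
            "properties": {
                "entityInfo": {
                    "properties": {
                        "entityType": { "type": "string", "index": "not_analyzed" },
                        "defaultLocale": { "type": "string", "index": "not_analyzed" }
                    }
                },
                "systemInfo": {
                    "properties": {
                        "tenantId": { "type": "integer" }
                    }
                },
                "attributesInfo": {
                    "properties": {
                        "jobId": { "type": "integer" },
                        "matchStatus": { "type": "string", "index": "not_analyzed" }
                    }
                }
            }
        }
    }
}'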

Generating Mongo query from MySQL query

I have been using the following MySQL command to construct a heatmap from log data. However, I have a new data set stored in a Mongo database, and I need to run the same query.
select concat(a.packages, '&', b.packages) "Concurrent Packages",
       count(*) "Count"
from data a
cross join data b
where a.packages < b.packages and a.jobID = b.jobID
group by a.packages, b.packages
order by a.packages, b.packages;
Keep in mind that the tables a and b do not exist prior to the query; they are aliases of the data table, whose packages column I want to pair up, with jobID as the field I check for matches. In other words, if two packages appear within the same job, I want to add an entry to the concurrent-usage count. How can I write a similar query in Mongo?
This is not a "join" of different documents; it is an operation within one document, and can be done in MongoDB.
You have a SQL table "data" like this:
    JobID TEXT,
    package TEXT
The best way to store this in MongoDB will be a collection called "data", containing one document per JobID that contains an array of packages:
{
    _id: <JobID>,
    packages: [
        "packageA",
        "packageB",
        ....
    ]
}
[ Note: you could also implement your data table as a single document in MongoDB, containing an array of jobs which each contain an array of packages. This is not recommended: you might hit the 16MB document size limit, and nested arrays are not (yet) well supported by different queries, if you want to use the data for other purposes as well. ]
Now, how to get a result like this ?
{ pair: [ "packageA", "packageB" ], count: 20 },
{ pair: [ "packageA", "packageC" ], count: 11 },
...
As there is no built-in "cross join" of two arrays in MongoDB, you'll have to program it out in the map function of a mapReduce(), emitting each pair of packages as a key:
mapf = function () {
    var that = this;
    this.packages.forEach( function( p1 ) {
        that.packages.forEach( function( p2 ) {
            if ( p1 < p2 ) {
                var key = { "pair" : [ p1, p2 ] };
                emit( key, 1 );
            }
        });
    });
};
[ Note: this could be optimized, if the packages arrays were sorted ]
The reduce function is nothing more than summing up the counters for each key:
reducef = function( key, values ) {
count = 0;
values.forEach( function( value ) { count += value } );
return count;
};
So, for this example collection:
> db.data.find()
{ "_id" : "Job01", "packages" : [ "pA", "pB", "pC" ] }
{ "_id" : "Job02", "packages" : [ "pA", "pC" ] }
{ "_id" : "Job03", "packages" : [ "pA", "pB", "pD", "pE" ] }
we get the following result:
> db.data.mapReduce(
... mapf,
... reducef,
... { out: 'pairs' }
... );
{
    "result" : "pairs",
    "timeMillis" : 443,
    "counts" : {
        "input" : 3,
        "emit" : 10,
        "reduce" : 2,
        "output" : 8
    },
    "ok" : 1
}
> db.pairs.find()
{ "_id" : { "pair" : [ "pA", "pB" ] }, "value" : 2 }
{ "_id" : { "pair" : [ "pA", "pC" ] }, "value" : 2 }
{ "_id" : { "pair" : [ "pA", "pD" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pA", "pE" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pC" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pD" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pE" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pD", "pE" ] }, "value" : 1 }
For more information on mapReduce consult: http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/ and http://docs.mongodb.org/manual/applications/map-reduce/
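As a side note: on newer MongoDB versions (3.6+), the same within-document cross join can be written with the aggregation pipeline instead of mapReduce. A sketch, assuming the collection layout above:
db.data.aggregate([
    // Duplicate the array so it can be unwound against itself
    { $project: { p1: "$packages", p2: "$packages" } },
    { $unwind: "$p1" },
    { $unwind: "$p2" },
    // Keep each unordered pair exactly once (lexicographic order, as in the map function)
    { $match: { $expr: { $lt: [ "$p1", "$p2" ] } } },
    { $group: { _id: { pair: [ "$p1", "$p2" ] }, count: { $sum: 1 } } },
    { $sort: { "_id.pair": 1 } }
])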
You can't. Mongo doesn't do joins. Switching from SQL to Mongo is a lot more involved than migrating your queries.
Typically, you would include all the pertinent information in the same record (rather than normalize the information and select it with a join). Denormalize!