I have a MongoDB collection with over 2.8 million documents of common passwords (hashed with SHA-1) and their popularity.
Currently I've imported the documents with the following schema:
{"_id":"5ded1a559015155eb8295f48","password":"20EABE5D64B0E216796E834F52D61FD0B70332FC:2512537"}
However, I'd like to split this so that the popularity is a separate value, something like this:
{"_id":"5ded1a559015155eb8295f48","password":"20EABE5D64B0E216796E834F52D61FD0B70332FC","popularity":2512537}
I'm unsure how to split the password field into two fields, password and popularity, using : as the separator.
You can use the Aggregation Framework to split the current password into two fields. Start with $indexOfBytes to get the position of the colon, then use $substr to create the new fields based on that position.
db.collection.aggregate([
  {
    $addFields: {
      colonPos: { $indexOfBytes: [ "$password", ":" ] }
    }
  },
  {
    $addFields: {
      password: { $substr: [ "$password", 0, "$colonPos" ] },
      // Start one byte past the colon so that ":" is not copied into popularity.
      popularity: { $substr: [ "$password", { $add: [ "$colonPos", 1 ] }, { $strLenBytes: "$password" } ] }
    }
  },
  {
    $project: {
      colonPos: 0
    }
  }
])
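As a last step you can use $out, which takes all your aggregation results and writes them into a new or existing collection. Note that $substr returns a string, so popularity is stored as a string unless you convert it (for example with $toInt, available since MongoDB 4.0).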
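The index-and-substring logic described above can be sketched in plain JavaScript, which is handy for checking the expected result on a few sample strings before running anything against the collection. This is only an illustration: splitPasswordEntry is a hypothetical helper name, no MongoDB connection is involved, and the null fallback for entries without a colon is an assumption about how you might want to treat them.

```javascript
// Plain-JavaScript sketch of the split; splitPasswordEntry is a
// hypothetical helper, not part of any MongoDB API.
function splitPasswordEntry(raw) {
  const colonPos = raw.indexOf(":"); // plays the role of $indexOfBytes
  if (colonPos === -1) {
    // Assumption: with no separator, keep the hash and leave popularity unset.
    return { password: raw, popularity: null };
  }
  return {
    password: raw.slice(0, colonPos),            // everything before the colon
    popularity: Number(raw.slice(colonPos + 1))  // skip the colon, convert to a number
  };
}

const entry = splitPasswordEntry("20EABE5D64B0E216796E834F52D61FD0B70332FC:2512537");
console.log(entry);
```

The + 1 keeps the colon out of the popularity value, and the Number conversion mirrors the desired schema, where popularity is numeric rather than a string.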
EDIT: An alternative approach using $split (thanks to #matthPen):
db.collection.aggregate([
  {
    $addFields: {
      password: { $arrayElemAt: [ { "$split": [ "$password", ":" ] }, 0 ] },
      popularity: { $arrayElemAt: [ { "$split": [ "$password", ":" ] }, 1 ] }
    }
  }
])
I am trying to query a document in my MongoDB database.
Document:
{
_id: '111',
subEntities: [
{
subId: '999',
dateOfStart: '2098-01-01',
dateOfTermination: '2099-12-31'
},
{
subId: '998',
dateOfStart: '2088-01-01',
dateOfTermination: '2089-12-31'
}
]
}
My Query:
{"$and": [
{"subEntities.dateOfStart": {"$lte": "2098-01-02"}},
{"subEntities.dateOfTermination": {"$gte": "2099-12-30"}},
{"subEntities.subId": {"$in": ["998"]}}
]}
As you can see, I am trying to apply a date value and an ID to the subentities.
The date value should be between dateOfStart and dateOfTermination.
The query returns a match, although the date conditions only match the first subentity and the ID condition only matches the second subentity.
How can I make it so that there is only one match when both queries match the same subentity?
Can I aggregate the subentities?
Thanks a lot!
When you query arrays, Mongo by default "flattens" them, which means each condition of the query gets evaluated independently, and different array elements can satisfy different conditions.
You want to use $elemMatch, which lets you require that a single array element satisfies all the conditions, like so:
db.collection.find({
  subEntities: {
    $elemMatch: {
      dateOfStart: {
        "$lte": "2098-01-02"
      },
      dateOfTermination: {
        "$gte": "2099-12-30"
      },
      subId: {
        "$in": [
          "998"
        ]
      }
    }
  }
})
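To see why the original query matched while the $elemMatch version does not, here is a rough plain-JavaScript model of the two behaviours, using the document from the question. It is not MongoDB code: the predicates and variable names are made up for illustration and only mimic how the conditions are combined.

```javascript
// Plain-JavaScript model of the two matching behaviours; not MongoDB code.
const doc = {
  _id: "111",
  subEntities: [
    { subId: "999", dateOfStart: "2098-01-01", dateOfTermination: "2099-12-31" },
    { subId: "998", dateOfStart: "2088-01-01", dateOfTermination: "2089-12-31" }
  ]
};

// The three conditions from the question, one predicate per field.
const conditions = [
  (e) => e.dateOfStart <= "2098-01-02",
  (e) => e.dateOfTermination >= "2099-12-30",
  (e) => ["998"].includes(e.subId)
];

// Dotted-path style: each condition may be satisfied by a different element.
const matchesAcrossElements = conditions.every((cond) => doc.subEntities.some(cond));

// $elemMatch style: one single element has to satisfy every condition.
const matchesSameElement = doc.subEntities.some((e) => conditions.every((cond) => cond(e)));

console.log(matchesAcrossElements, matchesSameElement); // true false
```

The first subentity satisfies both date conditions and the second one satisfies the ID condition, so the dotted-path form matches the document even though no single subentity meets all three conditions.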
If you want to filter dates between dateOfStart and dateOfTermination you should invert the $gte and $lte conditions:
{
"$and": [
{ "subEntities.dateOfStart": { "$gte": "2098-01-02" } },
{ "subEntities.dateOfTermination": { "$lte": "2099-12-30" } },
{ "subEntities.subId": { "$in": ["998"] } }
]
}
I am a beginner with MongoDB (JSON/NoSQL). Please help me write this query in MongoDB. In SQL I would use something like:
select * from account where status = 'active' and
(create_by = 'USE001' or create_by = 'USE004' or create_by = 'USE035')
How can I do the same in MongoDB?
This is my data structure:
{
"id":"ACC0001",
"create_day":"2020-04-20 16:56:11",
"create_by":"USE001",
"brief_name":"AAAAA",
"status":"active"
},
{
"id":"ACC0002",
"create_day":"2020-04-20 16:56:12",
"create_by":"USE002",
"brief_name":"BBBBB",
"status":"inactive"
},
{
"id":"ACC0003",
"create_day":"2020-04-20 16:56:13",
"create_by":"USE003",
"brief_name":"CCCCC",
"status":"active"
},
{
"id":"ACC0004",
"create_day":"2020-04-20 16:56:14",
"create_by":"USE004",
"brief_name":"DDDDD",
"status":"inactive"
},
{
"id":"ACC9999",
"create_day":"2020-04-20 16:56:15",
"create_by":"USE100",
"brief_name":"FFFFF",
"status":"active"
}
We can translate your query into the following. Note the use of the $in operator.
db.account.find({
status: "active",
create_by: { $in: [ "USE001", "USE004", "USE035" ] }
})
You could technically also write this as the query below; however, MongoDB recommends using $in over $or when you're doing equality checks on the same field. I'm mentioning this since you weren't using IN in your original SQL query. See: https://docs.mongodb.com/manual/reference/operator/query/or/#or-versus-in
db.account.find({
status: "active",
$or: [
{ create_by: "USE001" },
{ create_by: "USE004" },
{ create_by: "USE035" }
]
})
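As a quick sanity check that the two forms select the same documents, the filters can be modelled in plain JavaScript over the sample data from the question (trimmed here to the fields the query touches). This is only a sketch of the matching logic, not MongoDB code, and the variable names are illustrative.

```javascript
// Sample documents from the question, trimmed to the queried fields.
const accounts = [
  { id: "ACC0001", create_by: "USE001", status: "active" },
  { id: "ACC0002", create_by: "USE002", status: "inactive" },
  { id: "ACC0003", create_by: "USE003", status: "active" },
  { id: "ACC0004", create_by: "USE004", status: "inactive" },
  { id: "ACC9999", create_by: "USE100", status: "active" }
];

const creators = ["USE001", "USE004", "USE035"];

// Model of the $in form: status matches AND create_by is in the list.
const withIn = accounts
  .filter((a) => a.status === "active" && creators.includes(a.create_by))
  .map((a) => a.id);

// Model of the $or form: status matches AND any of the equality checks holds.
const withOr = accounts
  .filter((a) => a.status === "active" &&
    (a.create_by === "USE001" || a.create_by === "USE004" || a.create_by === "USE035"))
  .map((a) => a.id);

console.log(withIn, withOr);
```

With this sample data both forms return only ACC0001, because ACC0004 was created by USE004 but is inactive.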
Here is the sample JSON:
[
{
"_id": "123456789",
"YEAR": "2019",
"VERSION": "2019.Version",
"QUESTION_GROUPS": [
{
"QUESTIONS": [
{
"QUESTION_NAME": "STATE_CODE",
"QUESTION_VALUE": "MH"
},
{
"QUESTION_NAME": "COUNTY_NAME",
"QUESTION_VALUE": "IN"
}
]
},
{
"QUESTIONS": [
{
"QUESTION_NAME": "STATE_CODE",
"QUESTION_VALUE": "UP"
},
{
"QUESTION_NAME": "COUNTY_NAME",
"QUESTION_VALUE": "IN"
}
]
}
]
}
]
The query I am using:
db.collection.find({},
{
"QUESTION_GROUPS.QUESTIONS.QUESTION_NAME": "STATE_CODE"
})
My requirement is to retrieve every QUESTION_VALUE whose QUESTION_NAME equals STATE_CODE.
Thanks in advance.
If I understand you correctly, you are trying to do something like:
db.collection.find(
{
"QUESTION_GROUPS.QUESTIONS.QUESTION_NAME": "STATE_CODE"
},
{
"QUESTION_GROUPS.QUESTIONS.QUESTION_VALUE": 1
})
Attention: you will get ALL the "QUESTION_VALUE" entries for ANY document that has a QUESTION_GROUPS.QUESTIONS.QUESTION_NAME with that value.
Attention 2: You will also get the _id field, which is included by default.
If you want to avoid these issues, you may need to use an aggregation and unwind "QUESTION_GROUPS" -> "QUESTIONS". This way you can drop both the irrelevant results and the _id field.
It sounds like you want to unwind the arrays and get only the question values back.
Try this:
db.collection.aggregate([
  {
    $unwind: "$QUESTION_GROUPS"
  },
  {
    $unwind: "$QUESTION_GROUPS.QUESTIONS"
  },
  {
    $match: {
      "QUESTION_GROUPS.QUESTIONS.QUESTION_NAME": "STATE_CODE"
    }
  },
  {
    $project: {
      "QUESTION_GROUPS.QUESTIONS.QUESTION_VALUE": 1
    }
  }
])
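The unwind, match, project sequence can be modelled in plain JavaScript with flatMap, filter and map, which makes the flow of data through the stages easier to follow. This is only a sketch over the sample document, not MongoDB code; unlike the pipeline, which returns nested documents, it collects the values into a flat array.

```javascript
// Sample document from the question.
const docs = [
  {
    _id: "123456789",
    YEAR: "2019",
    VERSION: "2019.Version",
    QUESTION_GROUPS: [
      { QUESTIONS: [
        { QUESTION_NAME: "STATE_CODE", QUESTION_VALUE: "MH" },
        { QUESTION_NAME: "COUNTY_NAME", QUESTION_VALUE: "IN" }
      ] },
      { QUESTIONS: [
        { QUESTION_NAME: "STATE_CODE", QUESTION_VALUE: "UP" },
        { QUESTION_NAME: "COUNTY_NAME", QUESTION_VALUE: "IN" }
      ] }
    ]
  }
];

const stateCodes = docs
  .flatMap((d) => d.QUESTION_GROUPS)                  // like the first $unwind
  .flatMap((g) => g.QUESTIONS)                        // like the second $unwind
  .filter((q) => q.QUESTION_NAME === "STATE_CODE")    // like $match
  .map((q) => q.QUESTION_VALUE);                      // like $project, but flattened

console.log(stateCodes); // [ 'MH', 'UP' ]
```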
I have a JSON response like the one below. I need to pick a random child of options, and also a random entry from the values within that option. How can this be achieved in JMeter?
{
"id":37,
"merchant_id":"39",
"title":"Parker Pens",
"subtitle":null,
"price":1000,
"description":null,
"images":[ ],
"image_thumbs":[ ],
"options":[
{
"code":"color",
"label":"Color",
"extra_info":"",
"values":[
{ },
{ },
{ }
]
},
{
"code":"size",
"label":"Size",
"extra_info":"",
"values":[
{ },
{ },
{ }
]
}
],"options_available":[
{ },
{ },
{ },
{ },
{ },
{ },
{ },
{ },
{ }
], "custom_options":[
]
}
I have to fetch a child of options at random. From that child I have to fetch the value of "code" and an associated value from its "values" array.
Any help is appreciated.
Your requirements are a little vague, as you haven't indicated the desired output format. One solution would be to use a JSR223 PostProcessor to obtain a random value from a random entry of the options array, like:
import com.jayway.jsonpath.JsonPath
import org.apache.commons.lang3.RandomUtils
import org.apache.jmeter.samplers.SampleResult
def options = JsonPath.read(prev.getResponseDataAsString(), '$.options')
def randomOption = options.get(RandomUtils.nextInt(0, options.size()))
def values = randomOption.get('values')
def randomValue = values.get(RandomUtils.nextInt(0, values.size())) as String
vars.put('randomValue', randomValue)
References:
Jayway JsonPath - A Java DSL for reading JSON documents
Apache Groovy - Why and How You Should Use It
Apache Groovy - Parsing and Producing JSON
I'm running an aggregate through PyMongo.
The aggregate, formatted fairly nicely, looks like this:
[{
  $match: {
    syscode: {
      $in: [598.0]
    },
    date: {
      $gte: new Date(1509487200000),
      $lte: new Date(1510264800000)
    }
  }
},
{
  $group: {
    _id: {
      date: "$date",
      start_date: "$start_date",
      end_date: "$end_date",
      daypart: "$daypart",
      network: "$network"
    },
    syscode_data: {
      $push: {
        syscode: "$syscode",
        cpm: "$cpm"
      }
    }
  }
}]
It returns no results when I use the .explode methods on its cursor in Python.
When I run it through NoSQL Booster for MongoDB, I get the results back. That said, the Mongo log files don't change from what I'm seeing when I run it through PyMongo.
When I look at the Mongo logs, there's an additional group by pipeline added to them. Apparently the Booster knows what to do with this and I don't.
{ $group: { _id: null, count: { $sum: 1.0 } } }
This is the full log line I see.
2018-03-11T21:05:04.374+0200 I COMMAND [conn71] command Customer.weird_stuff command: aggregate { aggregate: "rate_cards", pipeline: [ { $match: { syscode: { $in: [ 598.0 ] }, date: { $gte: new Date(1509487200000), $lte: new Date(1510264800000) } } }, { $group: { _id: { date: "$date", start_date: "$start_date", end_date: "$end_date", daypart: "$daypart", network: "$network" }, syscode_data: { $push: { syscode: "$syscode", cpm: "$cpm" } } } }, { $group: { _id: null, count: { $sum: 1.0 } } } ], cursor: { batchSize: 1000.0 }, $db: "Customer" } planSummary: COLLSCAN keysExamined:0 docsExamined:102900 cursorExhausted:1 numYields:803 nreturned:1 reslen:134 locks:{ Global: { acquireCount: { r: 1610 } }, Database: { acquireCount: { r: 805 } }, Collection: { acquireCount: { r: 805 } } } protocol:op_query 122ms
What's going on? How do I handle this from the Python side?
Notes as I'm digging: this pipeline runs when I get lucky and use an unordered dictionary (the default) with PyMongo. When I run the input JSON through json.JSONDecoder with the line:
json.JSONDecoder(object_pairs_hook=OrderedDict).decode(parsed_param)
the output has a very complex format (necessary due to the pipeline needing to maintain its order) and ends up passing that extra piece.
So, lacking interest I found a workaround. Examining the problem, I found that when I added an additional step to the pipeline ({"$sort": {"_id": 1}}) the translation from Python dictionary to Mongo JSON aggregate didn't generate the extra JSON object.
This is a poor answer, but I think the root cause is that the conversion between complex ordered dictionaries and Mongo JSON queries in this particular environment has a little tiny bug that affected this particular query.
I would be excited to go find it and examine it further, but I'm buried at a new job.