How to combine multi field distinct in elasticsearch - mysql

For example, I have six documents:
{"task_id": 1, "frame": 1, "job_id": 1},
{"task_id": 1, "frame": 1, "job_id": 1},
{"task_id": 1, "frame": 1, "job_id": 3},
{"task_id": 2, "frame": 1, "job_id": 2},
{"task_id": 2, "frame": 1, "job_id": 3},
{"task_id": 3, "frame": 1, "job_id": 3},
I want to get the count of distinct documents for each task_id.
The expected result is (the key is "task_id"):
[
{"key": 1, "doc_count": 2},
{"key": 2, "doc_count":2},
{"key": 3, "doc_count":1}
]
Note: the first and second documents have identical values in every field, so they are counted only once.
How can I write this query in Elasticsearch? I can easily write it in SQL, but I am stuck in Elasticsearch.
My MySQL query is:
select tmp.task_id, count(*) from (select distinct task_id,frame,job_id from mytable) as tmp group by tmp.task_id

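You can use a terms aggregation over the combination of the task_id, frame and job_id fields (using a script). Each bucket then corresponds to one distinct combination of the three values, which gives you the basis for the count you are expecting.
curl -XPOST localhost:9200/your_index/_search -d '{
"size": 0,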
"aggs" : {
"tasks" : {
"terms" : { "script" : "[doc.task_id.value, doc.frame.value, doc.job_id.value].join(',')" }
}
}
}'
Note that in order to run this, you need to enable dynamic scripting.
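To make the intended semantics concrete, here is a minimal Python sketch that reproduces what the SQL query in the question computes, using the six sample documents. It does not talk to Elasticsearch at all; the function name distinct_count_per_task is made up for illustration. It can serve as a reference when checking what an aggregation returns.

```python
from collections import Counter

# The six sample documents from the question.
docs = [
    {"task_id": 1, "frame": 1, "job_id": 1},
    {"task_id": 1, "frame": 1, "job_id": 1},
    {"task_id": 1, "frame": 1, "job_id": 3},
    {"task_id": 2, "frame": 1, "job_id": 2},
    {"task_id": 2, "frame": 1, "job_id": 3},
    {"task_id": 3, "frame": 1, "job_id": 3},
]


def distinct_count_per_task(documents):
    """Count distinct (task_id, frame, job_id) combinations per task_id."""
    # Step 1: SELECT DISTINCT task_id, frame, job_id
    distinct = {(d["task_id"], d["frame"], d["job_id"]) for d in documents}
    # Step 2: GROUP BY task_id, COUNT(*)
    counts = Counter(task_id for task_id, _frame, _job_id in distinct)
    return [{"key": k, "doc_count": counts[k]} for k in sorted(counts)]


result = distinct_count_per_task(docs)
print(result)
```

The set comprehension plays the role of the inner SELECT DISTINCT, and the Counter plays the role of the outer GROUP BY.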

Related

I need to compare 2 JSON files in PYTHON with transferring the result to a separate file

I have 2 JSON-files and I need to compare them.
json_new.json
{"company_id": 111111, "resource": "record", "resource_id": 406155061, "status": "create", "data": {"id": 11111111, "company_id": 111111, "services": [{"id": 22222225, "title": "\u0421\u0442\u0440\u0438\u0436\u043a\u0430", "cost": 1500, "cost_per_unit": 1500, "first_cost": 1500, "amount": 1}], "goods_transactions": [], "staff": {"id": 1819441, "name": "\u041c\u0430\u0441\u0442\u0435\u0440"}, "client": {"id": 130345867, "name": "\u041a\u043b\u0438\u0435\u043d\u0442", "phone": "79111111111", "success_visits_count": 2, "fail_visits_count": 0}, "clients_count": 1, "datetime": "2022-01-25T13:00:00+03:00", "create_date": "2022-01-22T00:54:00+03:00", "online": false, "attendance": 2, "confirmed": 1, "seance_length": 3600, "length": 3600, "master_request": 1, "visit_id": 346427049, "created_user_id": 10573443, "deleted": false, "paid_full": 1, "last_change_date": "2022-01-22T00:54:00+03:00", "record_labels": "", "date": "2022-01-22 10:00:00"}}
json_old.json
{"company_id": 111111, "resource": "record", "resource_id": 406155061, "status": "create", "data": {"id": 11111111, "company_id": 111111, "services": [{"id": 9035445, "title": "\u0421\u0442\u0440\u0438\u0436\u043a\u0430", "cost": 1500, "cost_per_unit": 1500, "first_cost": 1500, "amount": 1}], "goods_transactions": [], "staff": {"id": 1819441, "name": "\u041c\u0430\u0441\u0442\u0435\u0440"}, "client": {"id": 130345867, "name": "\u041a\u043b\u0438\u0435\u043d\u0442", "phone": "79111111111", "success_visits_count": 2, "fail_visits_count": 0}, "clients_count": 1, "datetime": "2022-01-25T11:00:00+03:00", "create_date": "2022-01-22T00:54:00+03:00", "online": false, "attendance": 0, "confirmed": 1, "seance_length": 3600, "length": 3600, "master_request": 1, "visit_id": 346427049, "created_user_id": 10573443, "deleted": false, "paid_full": 0, "last_change_date": "2022-01-22T00:54:00+03:00", "record_labels": "", "date": "2022-01-22 10:00:00"}}
In these files, I need to compare the individual parts specified in diff_list:
diff_list = ["services", "staff", "datetime"]
The code should also print the result to the console and write a copy of it to a file called result.json.
My code
import json

# Load both JSON files
with open('json_old.json') as json_1:
    json1_dict = json.load(json_1)
with open('json_new.json') as json_2:
    json2_dict = json.load(json_2)
diff_list = ["services", "staff", "datetime"]
result = [
    sorted(json1_dict.items()),
    sorted(json2_dict.items())
]
print(sorted(json1_dict.items()) == sorted(json2_dict.items()))
with open('result.json', 'w') as f:
    json.dump(result, f)
This code actually works, but I need to detect changes in the specific parameters listed in diff_list and output what changed, with the old and the new value.
Thank you for your support, guys :)
To catch what has changed between json1_dict and json2_dict, you can use the following one-liner, which makes good use of a dictionary comprehension:
changed_items = {k: [json1_dict[k], json2_dict[k]] for k in json1_dict if k in json2_dict and json1_dict[k] != json2_dict[k]}
Every key of changed_items maps to two values: the first from json1_dict and the second from json2_dict. If you are only interested in the keys listed in diff_list, adjust the condition in the expression slightly:
changed_items = {k: [json1_dict[k], json2_dict[k]] for k in json1_dict if k in json2_dict and k in diff_list and json1_dict[k] != json2_dict[k]}
Note that in the sample files the diff_list keys live under "data", so apply the expression to json1_dict["data"] and json2_dict["data"] rather than to the top-level dictionaries.
All you need afterwards is print(changed_items).
import json

# Load JSON files
with open('json_old.json') as json_file1:
    json1_dict = json.load(json_file1)
with open('json_new.json') as json_file2:
    json2_dict = json.load(json_file2)
diff_list = ["services", "staff", "datetime"]

# Compare the parts specified in diff_list and print the differences
result = {}
for key in diff_list:
    if json1_dict['data'][key] != json2_dict['data'][key]:
        result[key] = {
            'old_value': json1_dict['data'][key],
            'new_value': json2_dict['data'][key]
        }
print(result)

# Write the differences to a result.json file
with open('result.json', 'w') as outfile:
    json.dump(result, outfile)
This code snippet will compare the JSON files and print the differences in the parts specified in the diff_list variable to the console. It will also write the differences to a file named result.json.
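If a key from diff_list might be missing in one of the files, the direct indexing above would raise a KeyError. The following is a minimal sketch of the same comparison wrapped in a function that uses dict.get instead. The function name diff_selected and the trimmed-down sample dictionaries are assumptions made for illustration; they stand in for the "data" sections of the two files.

```python
import json


def diff_selected(old, new, keys):
    """Return {key: {"old_value": ..., "new_value": ...}} for keys whose values differ."""
    changes = {}
    for key in keys:
        # .get returns None for a missing key instead of raising KeyError.
        old_value = old.get(key)
        new_value = new.get(key)
        if old_value != new_value:
            changes[key] = {"old_value": old_value, "new_value": new_value}
    return changes


# Trimmed-down stand-ins for json1_dict["data"] and json2_dict["data"].
old_data = {
    "services": [{"id": 9035445, "cost": 1500}],
    "staff": {"id": 1819441},
    "datetime": "2022-01-25T11:00:00+03:00",
}
new_data = {
    "services": [{"id": 22222225, "cost": 1500}],
    "staff": {"id": 1819441},
    "datetime": "2022-01-25T13:00:00+03:00",
}

changes = diff_selected(old_data, new_data, ["services", "staff", "datetime"])
print(json.dumps(changes, indent=2))
```

With this approach a key that exists in only one file is reported as a change from or to None, which may or may not be what you want.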

Add a json object to an existing json object

I would like to add an object to an existing JSON object.
My existing JSON object:
{
"id": 1,
"horaire_id": 1,
"json": "{\"i\": \"idHoraire1-NumReservation1\", \"x\": 3, \"y\": 2, \"w\":5, \"h\": 3, \"j\": 0}",
"num_reserv": 1,
"nombre_heures": 7.5,
"created_at": "2021-09-16T13:55:00.000000Z",
"updated_at": "2021-09-16T13:55:03.000000Z"
}
And the object to insert into the existing json object :
{
"id": 1,
"evenement_type_id": 1,
"reservation_id": 1,
"posinreserv": 1,
"campus_id": 1,
"comment": null,
"created_at": null,
"updated_at": null
}
We tried json_encode and json_decode, without success:
$reservation = json_decode($reservation, true);
All of this is in Laravel 7 (not in JS).
Thank you.
$data1 = '{
"id": 1,
"horaire_id": 1,
"json": "{\"i\": \"idHoraire1-NumReservation1\", \"x\": 3, \"y\": 2, \"w\":5, \"h\": 3, \"j\": 0}",
"num_reserv": 1,
"nombre_heures": 7.5,
"created_at": "2021-09-16T13:55:00.000000Z",
"updated_at": "2021-09-16T13:55:03.000000Z"
}';
$data2 = '{
"id": 1,
"evenement_type_id": 1,
"reservation_id": 1,
"posinreserv": 1,
"campus_id": 1,
"comment": null,
"created_at": null,
"updated_at": null
}';
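Laravel (if $data1 and $data2 are Eloquent models or collections rather than JSON strings):
$array = array_merge($data1->toArray(), $data2->toArray());
PHP (decode the JSON strings into associative arrays first):
$array = array_merge(json_decode($data1, true), json_decode($data2, true));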
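You can use the spread operator (JSON objects are just associative arrays in PHP once you have decoded them with json_decode($json, true); note that unpacking arrays with string keys requires PHP 8.1 or later):
$result = [...$array_1, ...$array_2];
or
$result = array_merge($array_1, $array_2);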

How to Remove and update in one obj of child obj in JSON Using mysql

Given the following JSON stored inside a MySQL json data type:
var data = ' [
{
"key": 1,
"step": 6,
"param": [
{"key_1": "test1"},
{"key_2": "test2"},
{"key_3": "test3"}
]
},
{
"key": 4,
"step": 8,
"param": [
{"key_4": "test4"},
{"key_5": "test5"}
]
}
]';
I need to remove one object from the param array and save the updated JSON back to MySQL in a single query.
Note: I don't know the value, only the key. For example, given key_1, I want to remove {"key_1": "test1"}.
OUTPUT
[
{
"key": 1,
"step": 6,
"param": [
{"key_2": "test2"},
{"key_3": "test3"}
]
},
{
"key": 4,
"step": 8,
"param": [
{"key_4": "test4"},
{"key_5": "test5"}
]
}
]
Have you tried the function JSON_REMOVE in your attempts to achieve what you want?
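JSON_REMOVE takes one or more JSON paths, so the main difficulty is finding the path of the object that holds the key. The following Python sketch does that lookup in application code, which means it is not a single-query solution: it assumes you have already read the JSON value from the row. The function name paths_to_remove is hypothetical.

```python
import json


def paths_to_remove(document, key_name):
    """Return JSON paths of objects in any "param" array that contain key_name."""
    paths = []
    for i, entry in enumerate(document):
        for j, param in enumerate(entry.get("param", [])):
            if key_name in param:
                paths.append(f"$[{i}].param[{j}]")
    # Highest indexes first, so removing one element never shifts a later path.
    return list(reversed(paths))


data = json.loads("""
[
  {"key": 1, "step": 6,
   "param": [{"key_1": "test1"}, {"key_2": "test2"}, {"key_3": "test3"}]},
  {"key": 4, "step": 8,
   "param": [{"key_4": "test4"}, {"key_5": "test5"}]}
]
""")

paths = paths_to_remove(data, "key_1")
print(paths)
```

Each returned path could then be passed as a path argument to JSON_REMOVE in an UPDATE statement against your own table and column.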

jq: unnesting records and mixing fields from both record levels

I have the following file:
[
{
'id': 1,
'arr': [{'x': 1,
{'x': 2}]
},
{
'id': 2,
'arr': [{'x': 3},
{'x': 4}]
}
]
How can I transform it into the following form using jq?
[
{'id': 1, 'x': 1},
{'id': 1, 'x': 2},
{'id': 2, 'x': 3},
{'id': 2, 'x': 4},
]
Assuming it doesn't get any more complex than that, you could simply do this:
map(del(.arr) + .arr[])
This is under the assumption that you're replacing the arr property of each object with the contents of the items in arr. It's unclear what you're trying to do exactly.
The input shown in the question is not valid JSON. After making some minor changes to make it valid JSON, the following filter produces the output as shown below:
map( (.arr[]|.x) as $x | {id, "x": $x} )
Output:
[
{
"id": 1,
"x": 1
},
{
"id": 1,
"x": 2
},
{
"id": 2,
"x": 3
},
{
"id": 2,
"x": 4
}
]
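For readers less familiar with jq, the same unnesting can be sketched in Python. This mirrors the first filter (drop arr from the parent, then merge the parent with each element of arr). The function name unnest is made up for illustration, and the input is the corrected, valid version of the data from the question.

```python
records = [
    {"id": 1, "arr": [{"x": 1}, {"x": 2}]},
    {"id": 2, "arr": [{"x": 3}, {"x": 4}]},
]


def unnest(items, nested_key="arr"):
    """Emit one flat record per element of the nested array."""
    flattened = []
    for item in items:
        # Equivalent of del(.arr): the parent fields without the nested array.
        parent = {k: v for k, v in item.items() if k != nested_key}
        for child in item[nested_key]:
            # Equivalent of "+": merge parent and child fields.
            flattened.append({**parent, **child})
    return flattened


flat = unnest(records)
print(flat)
```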

Delete element of Solr

I deleted an item I no longer need from Solr, but it still appears in the Solr response.
The json:
{
"responseHeader": {
"status": 0,
"QTime": 1,
"params": {
"facet": "true",
"q": "*:*",
"facet.limit": "-1",
"facet.field": "manufacturer",
"wt": "json",
"rows": "0"
}
},
"response": {
"numFound": 84,
"start": 0,
"docs": []
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"manufacturer": [
"Chevrolet",
0,
"abarth",
1,
"audi",
7,
"austin",
1,
"bmw",
2,
"daewoo",
2,
"ford",
1,
"fso",
1,
"honda",
1,
"hyundai",
1,
"jaguar",
3,
"lexus",
1,
"mazda",
1,
"mitsubishi",
1,
"nissan",
1,
"pontiac",
1,
"seat",
1
]
},
"facet_dates": {},
"facet_ranges": {}
}
}
The deleted item is "Chevrolet". Its count is now 0, but it still appears:
"manufacturer":["Chevrolet",0,
I would like to remove the item completely. Is that possible? Thanks.
Here is the two-step approach I would follow:
1. Make sure the change (the deletion) is committed. You may need to issue a commit.
2. If it still shows facets with a zero count, append &facet.mincount=1 to your query.
&facet.mincount=1 makes sure facets with a zero count do not show up.
For more details, please refer to: http://wiki.apache.org/solr/SimpleFacetParameters#facet.mincount
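As a minimal sketch, the facet query from the question with facet.mincount=1 added could be built like this in Python. The base URL (host, port and the core name "cars") is an assumption and must be replaced with your own; the other parameters are taken from the response header shown in the question.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, field, mincount=1):
    """Build a Solr select URL that hides facet values below mincount."""
    params = {
        "q": "*:*",
        "rows": 0,
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,
        "facet.mincount": mincount,
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical Solr location; adjust host, port and core name.
url = build_facet_url("http://localhost:8983/solr/cars", "manufacturer")
print(url)
```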
In your case, this is probably caused by the uninverted index created by Solr.
Pass facet.mincount=1 in your query to get rid of this problem.