I am learning elasticsearch and following along with the tutorial. I uploaded three documents into an index. When I supply the following query:
curl 'localhost:9200/vehicles/_search?query=driver.name:Jon'
I as expected get back object two and object three. However when I try querying using json:
curl localhost:9200/vehicles/_search -d'
{
"query":{
"prefix":{
"driver.name":"Jon"
}}}'
I get no results back. I am following the tutorial very closely, so I don't understand what the issue is. Any help would be really appreciated. The uploaded objects are below.
Thank you!
id:one
'{
"color": "green",
"driver": {
"born":"1989-09-12",
"name": "Ben"
},
"make": "BMW",
"model": "Aztek",
"value": 3000.0,
"year": 2003
}'
id:two
'{
"color": "black",
"driver": {
"born":"1934-09-08",
"name": "Jon"
},
"make": "Mercedes",
"model": "Benz",
"value": 10000.0,
"year": 2012
}'
id:three
'{
"color": "green",
"driver": {
"born":"1934-09-08",
"name": "Jon"
},
"make": "BMW",
"model": "Benz",
"value": 10000.0,
"year": 2012
}'
The prefix-query "matches documents that have fields containing terms with a specified prefix (not analyzed)".
Note the "not analyzed"-part. Lucene is looking for anything starting with "Jon" in the index, but the standard analyzer lowercases terms. That is, "jon" is in the index, but "Jon" is not.
Thus, if you lowercase the text in your prefix-query, it should work. Here is a runnable example: https://www.found.no/play/gist/7629456
Try:
curl -XGET "http://localhost:9200/vehicles/_search" -d '
{
"query": {"query_string" : { "query" : "driver.name:Jon" }}
}'
In any case, If you are new to elasticsearch I really recommend you read the documentation because there are lots of types of queries. Besides, the results of queries also depends on how you index the documents, how you define the mapping, etc.
In order to use the prefix query, you need to hit a non-analyzed field. In your mappings for driver.name, if you set "index" to "not_analyzed", you can use the prefix query. Otherwise, you should use a match query or something similar.
Related
Here is a simplified json file of a terraform state file (let's call it dev.ftstate)
{
"version": 4,
"terraform_version": "0.12.9",
"serial": 2,
"lineage": "ba56cc3e-71fd-1488-e6fb-3136f4630e70",
"outputs": {},
"resources": [
{
"module": "module.rds.module.reports_cpu_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.reports_lag_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.cross_region_replica_lag_alert",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds",
"mode": "managed",
"type": "aws_db_instance",
"name": "master",
"provider": "provider.aws",
"instances": [
{
"schema_version": 0,
"attributes": {
"address": "dev-database.123456.us-east-8.rds.amazonaws.com",
"allocated_storage": 10,
"password": "",
"performance_insights_enabled": false,
"tags": {
"env": "development"
},
"timeouts": {
"create": "6h",
"delete": "6h",
"update": "6h"
},
"timezone": "",
"username": "admin",
"vpc_security_group_ids": [
"sg-1234"
]
},
"private": ""
}
]
}
]
}
There are many modules at the same level of module.rds inside the instances. I took out many of them to create the simplified version of the raw data. The key takeway: do not assume the array index will be constant in all cases.
I wanted to extract the password field in the above example.
My first attempt is to use equality check to extract the relevant modules
` jq '.resources[].module == "module.rds"' dev.tfstate`
but it actually just produced a list of boolean values. I don't see any mention of builtin functions like filter in jq's manual
then I tried to just access the field:
> jq '.resources[].module[].attributes[].password?' dev.tfstate
then it throws the following error
jq: error (at dev.tfstate:1116): Cannot iterate over string ("module.rds")
So what is the best way to extract the value? Hopefully it can only focus on the password attribute in module.rds module only.
Edit:
My purpose is to detect if a password is left inside a state file. I want to ensure the passwords are exclusively stored in AWS secret manager.
You can extract the module you want like this.
jq '.resources[] | select(.module == "module.rds")'
I'm not confident that I understand the requirements for the rest of the solution. So this might not only not be the best way of doing what you want; it might not do what you want at all!
If you know where password will be, you can do this.
jq '.resources[] | select(.module == "module.rds") | .instances[].attributes.password'
If you don't know exactly where password will be, this is a way of finding it.
jq '.resources[] | select(.module == "module.rds") | .. | .password? | values'
According to the manual under the heading "Recursive Descent," ..|.a? will "find all the values of object keys “a” in any object found “below” ."
values filters out the null results.
You could also get the password value out of the state file without jq by using Terraform outputs. Your module should define an output with the value you want to output and you should also output this at the root module.
Without seeing your Terraform code you'd want something like this:
modules/rds/main.tf
resource "aws_db_instance" "master" {
# ...
}
output "password" {
value = aws_db_instance.master.password
sensitive = true
}
example/main.tf
module "rds" {
source = "../modules/rds"
# ...
}
output "rds_password" {
value = module.rds.password
sensitive = true
}
The sensitive = true parameter means that Terraform won't print the output to stdout when running terraform apply but it's still held in plain text in the state file.
To then access this value without jq you can use the terraform output command which will retrieve the output from the state file and print it to stdout. From there you can use it however you want.
I'm currently trying to do a basic JSON file import into my ELK stack. I tried importing it directly via a POST request like this:
curl -XPOST http://localhost:9200/kwd_results/TS_Cart -d #/home/local/TS_Cart.json
ES says ok for the import, but when I'm trying to view the logs in Kibanna, they are not indexed by the nodes of the JSON file. I'm guessing I need like a template mapping to view it properly.
My JSON file looks like this:
{
"testResults": {
"FitNesseVersion": "v20160618",
"rootPath": "K1System.CountryDe.DriverFirefox.TestCases.MainFolder.TestVariants.SmokeTests_B2C.TS_Cart",
"result": [
{
"counts": {
"right": "16",
"wrong": "2",
"ignores": "3",
"exceptions": "1"
},
"date": "2017-05-10T00:01:11+02:00",
"runTimeInMillis": "117242",
"relativePageName": "TestCase_1",
"pageHistoryLink": "K1System.CountryDe.DriverFirefox.TestCases.MainFolder.TestVariants.SmokeTests_B2C.TS_Cart.B2CFreeCatalogueOrder?pageHistory&resultDate=20170510000111",
"tags": "de, at"
},
{
"counts": {
"right": "16",
"wrong": "0",
"ignores": "0",
"exceptions": "0"
},
"date": "2017-05-10T00:03:08+02:00",
"runTimeInMillis": "85680",
"relativePageName": "TestCase_2",
"pageHistoryLink": "K1System.CountryDe.DriverFirefox.TestCases.MainFolder.TestVariants.SmokeTests_B2C.TS_Cart.B2CGiftCardOrderWithAdvancePayment?pageHistory&resultDate=20170510000308",
"tags": "at, de"
}
],
"finalCounts": {
"right": "4",
"wrong": "1",
"ignores": "0",
"exceptions": "0"
},
"totalRunTimeInMillis": "482346"
}
}
Basically I would need rootPath to be used as an index, while having the following childs: counts, relativePageName, date and tags. Notice that I have two nodes that are childs of the result[] array.
Any help would be greatly appreciated!
Thank you.
Well, it's one JSON document so Elasticsearch treats it as such.
You'll need to (programmatically) split up the document into the right documents and then you can store them (potentially with one _bulk request).
For the index name:
Must be lowercase, so you'll need to cast that value.
Will you have many different root paths with jut a few docs each? Then you shouldn't make all of them an index since there is an overhead for each one of them (actually the underlying shards).
I have a multidimensional array that I want to index with CouchDB (really using Cloudant). I have users which have a list of the teams that they belong to. I want to search to find every member of that team. So, get me all the User objects that have a team object with id 79d25d41d991890350af672e0b76faed. I tried to make a json index on "Teams.id", but it didn't work because it isn't a straight array but a multidimensional array.
User
{
"_id": "683be6c086381d3edc8905dc9e948da8",
"_rev": "238-963e54ab838935f82f54e834f501dd99",
"type": "Feature",
"Kind": "Profile",
"Email": "gc#gmail.com",
"FirstName": "George",
"LastName": "Castanza",
"Teams": [
{
"id": "79d25d41d991890350af672e0b76faed",
"name": "First Team",
"level": "123"
},
{
"id": "e500c1bf691b9cfc99f05634da80b6d1",
"name": "Second Team Name",
"level": ""
},
{
"id": "4645e8a4958421f7d843d9b34c4cd9fe",
"name": "Third Team Name",
"level": "123"
}
],
"LastTeam": "79d25d41d991890350af672e0b76faed"
}
This is a lot like my response at Cloudant Selector Query but here's the deal, applied to your question:
The easiest way to run this query is using "Cloudant Query" (or "Mango", as it's called in the forthcoming CouchDB 2.0 release) -- and not the traditional MapReduce view indexing system in CouchDB. (This blog covers the differences: https://cloudant.com/blog/mango-json-vs-text-indexes/ and this one is an overview: https://developer.ibm.com/clouddataservices/2015/11/24/cloudant-query-json-index-arrays/).
Here's what your CQ index should look like:
{
"index": {
"fields": [
{"name": "Teams.[].id", "type": "string"}
]
},
"type": "text"
}
And what the subsequent query looks like:
{
"selector": {
"Teams": {"$elemMatch": {"id": "79d25d41d991890350af672e0b76faed"}}
},
"fields": [
"_id",
"FirstName",
"LastName"
]
}
You can try it yourself in the "Query" section of the Cloudant dashboard or via curl with something like this:
curl -H "Content-Type: application/json" -X POST -d '{"selector":{"Teams":{"$elemMatch":{"id":"79d25d41d991890350af672e0b76faed"}}},"fields":["_id","FirstName","LastName"]}' https://broberg.cloudant.com/teams_test/_find
That database is world-readable, so you can see the sample documents I created in there here: https://broberg.cloudant.com/teams_test/_all_docs?include_docs=true
Dig the Seinfeld theme :D
You simply need to loop through the Teams array and emit a view entry for each of the teams.
function (doc) {
if(doc.Kind === "Profile"){
for (var i=0; i<doc.Teams.length; i++) {
var team = doc.Teams[i];
emit(team.id, [doc.FirstName, doc.LastName]);
}
}
}
You can then query for all profiles with a specific team id by keying on the team id like this
.../view?key="79d25d41d991890350af672e0b76faed"
giving
{"total_rows":7,"offset":2,"rows":[
{"id":"0d15041f43b43ae07e8faa737f00032c","key":"79d25d41d991890350af672e0b76faed","value":["Adam","Alpha"]},
{"id":"68779729be3610fd8b52b22574000ae8","key":"79d25d41d991890350af672e0b76faed","value":["Bob","Bravo"]},
{"id":"9f97f1565f03aebae9ca73e207001ee1","key":"79d25d41d991890350af672e0b76faed","value":["Chuck","Charlie"]}
]}
or you can include the actual profiles in the result by adding &include_docs=true to the query.
I want to be able to access deeper elements stored in a json in the field json, stored in a postgresql database. For example, I would like to be able to access the elements that traverse the path states->events->time from the json provided below. Here is the postgreSQL query I'm using:
SELECT
data#>> '{userId}' as user,
data#>> '{region}' as region,
data#>>'{priorTimeSpentInApp}' as priotTimeSpentInApp,
data#>>'{userAttributes, "Total Friends"}' as totalFriends
from game_json
WHERE game_name LIKE 'myNewGame'
LIMIT 1000
and here is an example record from the json field
{
"region": "oh",
"deviceModel": "inHouseDevice",
"states": [
{
"events": [
{
"time": 1430247045.176,
"name": "Session Start",
"value": 0,
"parameters": {
"Balance": "40"
},
"info": ""
},
{
"time": 1430247293.501,
"name": "Mission1",
"value": 1,
"parameters": {
"Result": "Win ",
"Replay": "no",
"Attempt Number": "1"
},
"info": ""
}
]
}
],
"priorTimeSpentInApp": 28989.41467999999,
"country": "CA",
"city": "vancouver",
"isDeveloper": true,
"time": 1430247044.414,
"duration": 411.53,
"timezone": "America/Cleveland",
"priorSessions": 47,
"experiments": [],
"systemVersion": "3.8.1",
"appVersion": "14312",
"userId": "ef617d7ad4c6982e2cb7f6902801eb8a",
"isSession": true,
"firstRun": 1429572011.15,
"priorEvents": 69,
"userAttributes": {
"Total Friends": "0",
"Device Type": "Tablet",
"Social Connection": "None",
"Item Slots Owned": "12",
"Total Levels Played": "0",
"Retention Cohort": "Day 0",
"Player Progression": "0",
"Characters Owned": "1"
},
"deviceId": "ef617d7ad4c6982e2cb7f6902801eb8a"
}
That SQL query works, except that it doesn't give me any return values for totalFriends (e.g. data#>>'{userAttributes, "Total Friends"}' as totalFriends). I assume that part of the problem is that events falls within a square bracket (I don't know what that indicates in the json format) as opposed to a curly brace, but I'm also unable to extract values from the userAttributes key.
I would appreciate it if anyone could help me.
I'm sorry if this question has been asked elsewhere. I'm so new to postgresql and even json that I'm having trouble coming up with the proper terminology to find the answers to this (and related) questions.
You should definitely familiarize yourself with the basics of json
and json functions and operators in Postgres.
In the second source pay attention to the operators -> and ->>.
General rule: use -> to get a json object, ->> to get a json value as text.
Using these operators you can rewrite your query in the way which returns correct value of 'Total Friends':
select
data->>'userId' as user,
data->>'region' as region,
data->>'priorTimeSpentInApp' as priotTimeSpentInApp,
data->'userAttributes'->>'Total Friends' as totalFriends
from game_json
where game_name like 'myNewGame';
Json objects in square brackets are elements of a json array.
Json arrays may have many elements.
The elements are accessed by an index.
Json arrays are indexed from 0 (the first element of an array has an index 0).
Example:
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
-- returns "Mission1"
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
This did help me
I have a very complex JSON and a snippet of it is below:
var designerJSON=
{
"nodes":
[
{
"NodeDefinition": {
"name": "Start",
"thumbnail": "Start.png",
"icon": "Start.png",
"info": "Entry point ",
"help": "Start point in your workflow.",
"workflow ": "Start",
"category": "Basic",
"ui": [
{
"label": "Entry point",
"category": "Help",
"componet": "label",
"type": "label"
}
]
},
"States": [
{
"start": "node1"
}
]
},.......
]
}
I would like to get the value of "start" in States. But I am stuck in the first step of entering into JSON. When I try
console.log(designerJSON["nodes"]);
I am getting Undefined.
I want the value of start. Wich is designerJSON["nodes"]["States"]["start"].
Can you help.
Thanks in advance
designerJSON["nodes"]["States"]["start"] won't do it.
designerJSON["nodes"] is a list, as is States, so you need to access individual items by index (or iteration).
In the example you have given you need to use this:
designerJSON['nodes'][0]['States'][0]['start']
or this (cleaner IMO):
designerJSON.nodes[0].States[0].start
You have an array in JSON.
instead of
designerJSON["nodes"]["States"]["start"]
use
designerJSON["nodes"][0]["States"][0]["start"]
ps. pay attention on how code is formatted in the topic.
pps. using brackets for accessing properties in js is "bad style" (due to js hint recommendations). better access those via dot, e.g:
designerJSON.nodes[0].States[0].start