What does the FRAGS argument do in the RediSearch FT.SEARCH command?

I looked through the RediSearch documentation and found the syntax of FT.SEARCH below:
FT.SEARCH {index} {query} [NOCONTENT] [VERBATIM] [NOSTOPWORDS] [WITHSCORES] [WITHPAYLOADS] [WITHSORTKEYS]
[FILTER {numeric_field} {min} {max}] ...
[GEOFILTER {geo_field} {lon} {lat} {radius} m|km|mi|ft]
[INKEYS {num} {key} ... ]
[INFIELDS {num} {field} ... ]
[RETURN {num} {field} ... ]
[SUMMARIZE [FIELDS {num} {field} ... ] [FRAGS {num}] [LEN {fragsize}] [SEPARATOR {separator}]]
[HIGHLIGHT [FIELDS {num} {field} ... ] [TAGS {open} {close}]]
[SLOP {slop}] [INORDER]
[LANGUAGE {language}]
[EXPANDER {expander}]
[SCORER {scorer}]
[PAYLOAD {payload}]
[SORTBY {field} [ASC|DESC]]
[LIMIT offset num]
I cannot find a description of FRAGS anywhere. What does the FRAGS argument do in the RediSearch FT.SEARCH command? Is there any limit on the {num} parameter?

A fragment is a substring of the search result that contains contextual information related to the search term. For example, when searching for "Redis", a fragment in a document might be
"found. Users liked Redis more than other..."
This is part of the summarization feature: FRAGS {num} sets the maximum number of such fragments returned for each summarized field (the default is 3).
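To see how FRAGS fits in, here is a rough Python sketch of the idea (illustrative only, not RediSearch's actual summarization algorithm; summarize is a made-up helper): the engine gathers context windows around term matches, and FRAGS caps how many of them come back per field.

```python
def summarize(text, term, frags=3, length=20, separator="... "):
    """Illustrative sketch (not RediSearch's real algorithm): collect a
    context window around each occurrence of the search term and return
    at most `frags` fragments, joined with `separator`."""
    words = text.split()
    fragments = []
    for i, word in enumerate(words):
        if term.lower() in word.lower():
            start = max(0, i - length // 2)
            fragments.append(" ".join(words[start:start + length]))
        if len(fragments) == frags:  # FRAGS caps the number of fragments
            break
    return separator.join(fragments)
```

So FRAGS 1 would return only the first matching context window, while larger values return more context at the cost of a longer reply.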


How to get an object by name when it contains DOTs, CURLY BRACES and HASHTAGs in JSONPath?

I have the following JSON structure, generated by a Zabbix Discovery key, with the following data:
[{
"{#SERVICE.NAME}": ".WindowsService1",
"{#SERVICE.DISPLAYNAME}": ".WindowsService1 - Testing",
"{#SERVICE.DESCRIPTION}": "Application Test 1 - Master",
"{#SERVICE.STATE}": 0,
"{#SERVICE.STATENAME}": "running",
"{#SERVICE.PATH}": "E:\\App\\Test\\bin\\testingApp.exe",
"{#SERVICE.USER}": "LocalSystem",
"{#SERVICE.STARTUPTRIGGER}": 0,
"{#SERVICE.STARTUP}": 1,
"{#SERVICE.STARTUPNAME}": "automatic delayed"
},
{
"{#SERVICE.NAME}": ".WindowsService2",
"{#SERVICE.DISPLAYNAME}": ".WindowsService2 - Testing",
"{#SERVICE.DESCRIPTION}": "Application Test 2 - Slave",
"{#SERVICE.STATE}": 0,
"{#SERVICE.STATENAME}": "running",
"{#SERVICE.PATH}": "E:\\App\\Test\\bin\\testingApp.exe",
"{#SERVICE.USER}": "LocalSystem",
"{#SERVICE.STARTUPTRIGGER}": 0,
"{#SERVICE.STARTUP}": 1,
"{#SERVICE.STARTUPNAME}": "automatic delayed"
}]
So, what I want to do is use JSONPath to get ONLY the object where {#SERVICE.NAME} == .WindowsService1...
The problem is, when I try to create the JSONPath it gives me a couple of errors.
Here's what I tried, and what I discovered so far:
JSONPath:
$.[?(@.{#SERVICE.NAME} == '.WindowsService1')]
Error output:
jsonPath: Unexpected token '{': _$_v.{#SERVICE.NAME} ==
'.WindowsService1'
I also tried the following JSONPath, to match a regular expression:
$.[?(@.{#SERVICE.NAME} =~ '^(.WindowsService1$)')]
It gave me the same error, so the problem is not after the == or =~.
What I discovered is that if I REMOVE the curly braces {} and the hashtag #, and replace the dot . in the service name with _ (underscore), both in the JSONPath and in the JSON data, it works, like this:
Data without # {} . :
[{
"SERVICE_NAME": ".WindowsService1",
[...]
JSONPath following the new data structure:
$.[?(@.SERVICE_NAME == '.WindowsService1')]
But the real problem is, I need to maintain the original structure, with the curly braces, dots, and hashtags...
How can I escape those and stop seeing this error?
Thank you...
Use bracket notation to quote the key, which escapes the special characters:
$.[?(@['{#SERVICE.NAME}'] == '.WindowsService1')]
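As a sanity check outside any JSONPath engine, the same filter in plain Python uses the identical bracket-style key access (the sample below is abbreviated from the question's data):

```python
import json

# Abbreviated sample of the Zabbix discovery data from the question.
data = json.loads("""
[{"{#SERVICE.NAME}": ".WindowsService1", "{#SERVICE.STATE}": 0},
 {"{#SERVICE.NAME}": ".WindowsService2", "{#SERVICE.STATE}": 0}]
""")

# Bracket notation sidesteps the dot/brace/hash characters, exactly as
# ['{#SERVICE.NAME}'] does in the JSONPath expression above.
matches = [obj for obj in data if obj["{#SERVICE.NAME}"] == ".WindowsService1"]
```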

Analysing and formatting JSON using PostgreSQL

I have a table called api_details where I dump the JSON value below into the JSON column raw_data.
Now I need to make a report from this JSON string, and the expected output is something like the following:
action_name                sent_timestamp                            Sent    Delivered
campaign_2475              1600416865.928737 - 1601788183.440805     7504    7483
campaign_d_1084_SUN15_ex   1604220248.153903 - 1604222469.087918     63095   62961
Below is the sample JSON output:
{
"header": [
"#0 action_name",
"#1 sent_timestamp",
"#0 Sent",
"#1 Delivered"
],
"name": "campaign - lifetime",
"rows": [
[
"campaign_2475",
"1600416865.928737 - 1601788183.440805",
7504,
7483
],
[
"campaign_d_1084_SUN15_ex",
"1604220248.153903 - 1604222469.087918",
63095,
62961
],
[
"campaign_SUN15",
"1604222469.148829 - 1604411016.029794",
63303,
63211
]
],
"success": true
}
I tried the query below, but it is not giving the expected results. I can do it in Python by looping through all the elements in the rows list,
but is there an easy solution in PostgreSQL (version 11)?
SELECT raw_data->'rows'->0
FROM api_details
You can use the JSONB_ARRAY_ELEMENTS() function, such as:
SELECT (j.value)->>0 AS action_name,
(j.value)->>1 AS sent_timestamp,
(j.value)->>2 AS Sent,
(j.value)->>3 AS Delivered
FROM api_details
CROSS JOIN JSONB_ARRAY_ELEMENTS(raw_data->'rows') AS j
P.S. In this case the data type of raw_data is assumed to be JSONB; otherwise the argument within the function, raw_data->'rows', should be replaced with raw_data::JSONB->'rows' in order to perform explicit type casting.
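Since the question mentions looping in Python, here is that loop for comparison; the SQL above performs the same per-element flattening server-side (sample trimmed to one row):

```python
import json

# Trimmed sample of the raw_data JSON from the question.
raw_data = json.loads("""
{"rows": [["campaign_2475", "1600416865.928737 - 1601788183.440805", 7504, 7483]],
 "success": true}
""")

# The same flattening JSONB_ARRAY_ELEMENTS performs: one report row per
# element of the "rows" array, with positional fields named.
report = [
    {"action_name": r[0], "sent_timestamp": r[1], "Sent": r[2], "Delivered": r[3]}
    for r in raw_data["rows"]
]
```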

Search and replace based on a dictionary

I have a json file filled with a list of data where each element has one field called url.
[
{ ...,
...,
"url": "us.test.com"
},
...
]
In a different file I have a list of mappings that I need to replace the affected url fields with, formatted like this:
us.test.com test.com
hello.com/se hello.com
...
So the end result should be:
[
{ ...,
...,
"url": "test.com"
},
...
]
Is there a way to do this in Vim or do I need to do it programmatically?
Well, I'd do this programmatically in Vim ;-) As you'll see it's quite similar to Python and many other scripting languages.
Let's suppose we have the JSON file open. Then
:let foo = json_decode(join(getline(1, '$')))
will load json into VimScript variable. So :echo foo will show [{'url': 'us.test.com'}, {'url': 'hello.com/se'}].
Now let's switch to the "mapping" file. We're going to split all lines and make a Dictionary like this:
:let bar = {}
:for line in getline(1, '$') | let field = split(line) | let bar[field[0]] = field[1] | endfor
Now :echo bar shows {'hello.com/se': 'hello.com', 'us.test.com': 'test.com'} as expected.
To perform a substitution we do simply:
:for field in foo | let field.url = bar->get(field.url, field.url) | endfor
And now foo contains [{'url': 'test.com'}, {'url': 'hello.com'}] which is what we want. The remaining step is to write the new value into a buffer with
:put =json_encode(foo)
You could…
turn those lines in your mappings file (/tmp/mappings for illustration purposes):
us.test.com test.com
hello.com/se hello.com
...
into:
g/"url"/s#us.test.com#test.com#g
g/"url"/s#hello.com/se#hello.com#g
...
with:
:%normal Ig/"url"/s#
:%s/ /#
The idea is to turn the file into a script that will perform all those substitutions on all lines matching "url".
If you are confident that those strings are only in "url" lines, you can just do:
:%normal I%s#
:%s/ /#
to obtain:
%s#us.test.com#test.com#g
%s#hello.com/se#hello.com#g
...
write the file:
:w
and source it from your JSON file:
:source /tmp/mappings
See :help :g, :help :s, :help :normal, :help :range, :help :source, and :help pattern-delimiter.
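And if Vim turns out to be the wrong tool for the job, the same mapping-driven replacement is a short standalone script. A sketch in Python (apply_mappings and the inline sample data are illustrative, not from the question):

```python
def apply_mappings(records, mapping_lines):
    """Build an {old: new} dict from "old new" lines, then rewrite each
    record's "url" field, leaving unmapped urls untouched."""
    mapping = dict(line.split() for line in mapping_lines if line.strip())
    for record in records:
        record["url"] = mapping.get(record["url"], record["url"])
    return records

# Illustrative stand-ins for the question's JSON and mappings files.
records = [{"url": "us.test.com"}, {"url": "hello.com/se"}, {"url": "other.com"}]
mapped = apply_mappings(records, ["us.test.com test.com", "hello.com/se hello.com"])
```

In a real run you would json.load the data file and read the mappings file line by line, then json.dump the result back out.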

Apache Drill: Reserved word "user" as one of the column names in json data file

I am querying a nested json file, the structure looks something like as follows
{"user_id":1234,
"text":"example text"
"first_nested":{
"field1":"dummy string 1",
"field2":"dummy string 2"
},
"user":{
"field3":"dummy string 3",
"field4":"dummy string 4"
},
"last":1}
I have a nested JSON structure named "user", and when I query the following:
SELECT tbl.user AS us FROM dfs.`/filepath/trial.json` as tbl WHERE user_id=221
or
SELECT tbl.user.field1 AS us FROM dfs.`/filepath/trial.json` as tbl WHERE tbl.user_id=221
I get the following error:
UserRemoteException: PARSE ERROR: Encountered ". user" at line 1, column 11. Was expecting one of: "FROM" ... "," ... "AS" ... ... ... ... ... ... "." ... "NOT" ... "IN" ... "BETWEEN" ... "LIKE" ... "SIMILAR" ... "=" ... ">" ... "<" ... "<=" ... ">=" ... "<>" ... "+" ... "-" ... "*" ... "/" ... "||" ... "AND" ... "OR" ... "IS" ..
and when I simply use user without dereferencing it with tbl., the query returns the name of the user who owns the current Drill profile.
I can't change the name of the column in this JSON file; how do I get around this?
I looked this up in the docs. Use backticks for reserved words:
"Because the column alias contains the special space character, also enclose the alias in back ticks, as shown in the following example."
From https://drill.apache.org/docs/lexical-structure/
In your case:
SELECT tbl.`user`.field1 AS us FROM ...

TCL to JSON : Writing JSON output using huddle in single line

Let us consider the following Tcl data:
set arr {a {{c 1} {d {2 2 2} e 3}} b {{f 4 g 5}}}
Converted into JSON format using the huddle module:
set json_arr [huddle jsondump [huddle compile {dict * {list {dict d list}}} $arr]]
puts $json_arr
JSON formatted array:
{
"a": [
{"c": 1},
{
"d": [
2,
2,
2
],
"e": 3
}
],
"b": [{
"f": 4,
"g": 5
}]
}
Writing in a single line:
set json_arr [huddle jsondump [huddle compile {dict * {list {dict d list}}} $arr] {} {}]
puts $json_arr
Updated JSON formatted array:
{"a":[{"c":1},{"d":[2,2,2],"e":3}],"b":[{"f":4,"g":5}]}
What is the meaning of {} {} here?
Can I use the same approach to get single-line output from the json and json::write modules?
The last three, optional, arguments to jsondump are offset, newline, and begin_offset. You can use those to specify strings that are to be used to format the output string. If you don’t specify them, default strings will be used.
If you do specify them, you need to follow the protocol for optional arguments, i.e. if you want to specify begin_offset, you need to specify offset and newline too, etc. In this case, offset and newline are specified to be empty strings, and begin_offset uses its default value.
Try invoking jsondump with dummy values to get an idea of how they are used:
% huddle jsondump [huddle compile {dict * {list {dict d list}}} $arr] <A> <B> <C>
{<B><C><A>"a": [<B><C><A><A>{"c": 1},<B><C><A><A>{<B><C><A><A><A>"d": [<B><C><A><A><A><A>2,<B><C><A><A><A><A>2,<B><C><A><A><A><A>2<B><C><A><A><A>],<B><C><A><A><A>"e": 3<B><C><A><A>}<B><C><A>],<B><C><A>"b": [{<B><C><A><A><A>"f": 4,<B><C><A><A><A>"g": 5<B><C><A><A>}]<B><C>}
A newline and a begin_offset string are inserted around each component, and one or more offset strings are inserted before a component to reflect the indentation level.
json::write uses the indented and aligned subcommands to customize formatting.