I want to import a lot of information from a CSV file into Elasticsearch.
My issue is that I don't know how to use an equivalent of substring to select part of a CSV column.
In my case I have a date field (YYYYMMDD) and I want to turn it into (YYYY-MM-DD).
I use filter, mutate, gsub like:
filter {
  mutate {
    gsub => ["date", "[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]", "[0123456789][0123456789][0123456789][0123456789]-[0123456789][0123456789]-[0123456789][0123456789]"]
  }
}
But the result is wrong.
I can identify my string, but I don't know how to extract part of it.
My goal is to have something like:
gsub => ["date", "[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]", "%{date}(0..3)-%{date}(4..5)-%{date}(6..7)"]
%{date}(0..3): select the first four characters of the CSV column date
You can use the ruby plugin to do the conversion. As you say, you will have a date field, so we can use it directly in ruby:
filter {
  ruby {
    code => "
      # Logstash 5+ event API; older versions used event['date'] / event['date_new'] instead
      date = Time.strptime(event.get('date'), '%Y%m%d')
      event.set('date_new', date.strftime('%Y-%m-%d'))
    "
  }
}
The date_new field is in the format you want.
First, you can use a regexp character range to match a sequence, so rather than [0123456789], you can do [0-9]. If you know there will be four digits, you can do [0-9]{4}.
Second, you want to "capture" parts of your input string and reorder them in the output. For that, you need capture groups:
([0-9]{4})([0-9]{2})([0-9]{2})
where parens define the groups. Then you can reference those on the right side of your gsub:
\1-\2-\3
\1 is the first capture group, etc.
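Putting those two pieces together, the corrected filter from the question would look like this (a sketch, assuming the field is still named date):
filter {
  mutate {
    # Capture year, month, and day, then reassemble them with dashes
    gsub => ["date", "([0-9]{4})([0-9]{2})([0-9]{2})", "\1-\2-\3"]
  }
}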
You might also consider getting these three fields when you do the grok{}, and then putting them together again later (perhaps with add_field).
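For example, a sketch of that grok-based alternative (the year/month/day field names here are illustrative, not from the original question):
filter {
  grok {
    # Split the YYYYMMDD value into three named captures
    match => { "date" => "(?<year>[0-9]{4})(?<month>[0-9]{2})(?<day>[0-9]{2})" }
  }
  mutate {
    # Reassemble the parts into a dashed date
    add_field => { "date_new" => "%{year}-%{month}-%{day}" }
  }
}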
I have big JSON data in one column called response_return in a Postgres DB, with a response like:
{
"customer_payment":{
"OrderId":"123456789",
"Customer":{
"Full_name":"Francis"
},
"Payment":{
"AuthorizationCode":"9874565",
"Recurrent":false,
"Authenticate":false,
...
}
}
}
I tried to use Postgres operators like ->, ->>, #> and #>> to walk through the keys to reach AuthorizationCode in a query.
When I use -> on customer_payment in a SELECT, it returns everything below it. If I try it with OrderId, NULL is returned.
The alternatives and sources:
Using The JSON Datatype In PostgreSQL
Operator ->
Allows you to select an element based on its name.
Allows you to select an element within an array based on its index.
Can be used sequentially: ::json->'elementL'->'subelementM'->…->'subsubsubelementN'.
Return type is json and the result cannot be used with functions and operators that require a string-based datatype. But the result can be used with operators and functions that require a json datatype.
Query for element of array in JSON column
This is not helpful because I don't want to filter, and I don't believe I need to transform to an array.
If you just want to get a single attribute, you can use:
select response_return -> 'customer_payment' -> 'Payment' ->> 'AuthorizationCode'
from the_table;
You need to use -> for the intermediate access to the keys (to keep the json type) and ->> for the last key to return the value as a string. That is also why OrderId came back as NULL: it is nested inside customer_payment, so you have to reach it with response_return -> 'customer_payment' ->> 'OrderId'.
Alternatively you can provide the path to the element as an array and use #>>
select response_return #>> array['customer_payment', 'Payment', 'AuthorizationCode']
from the_table;
My JSON file looks something like:
{
"generator": {
"name": "Xfer Records Serum",
....
},
"generator": {
"name: "Lennar Digital Sylenth1",
....
}
}
I ask the user for a search term, and the input is searched for in the name key only. All matching results are returned, so if I input just 's', both of the above would be returned. Please also explain how to return all the object names that are generators. The simpler the method, the better for me. I use the json library, but if another library is required, that's not a problem.
Before switching to JSON I tried XML, but it did not work.
If your goal is just to search all name properties, this will do the trick:
import re

def search_names(term, lines):
    # Match lines like: "name": "...term...", capturing the name value
    # (re.escape keeps regex metacharacters in the search term literal)
    name_search = re.compile(r'\s*"name"\s*:\s*"(.*' + re.escape(term) + r'.*)",?$', re.I)
    return [x.group(1) for x in [name_search.search(y) for y in lines] if x]

with open('path/to/your.json') as f:
    lines = f.readlines()

print(search_names('s', lines))
which would return both names you listed in your example.
The way the search_names() function works is that it builds a regular expression matching any line that starts with "name": (with a varying amount of whitespace), followed by your search term with any other characters around it, terminated with " and an optional , at the end of the string. It then applies that expression to each line from the file. Finally, it filters out the non-matching lines and returns the value of the name property (the capture group contents) for each match.
How can I take some JSON data that contains a number and insert commas in the numbers?
Example: I fetch some JSON data from a URL and can display it; it contains a number, let's say 100000. It doesn't have commas to show it more readably as 100,000.
Language used: Angular 6 (TypeScript)
There are many ways to do this; pick your poison:
Intl.NumberFormat
var formatter = new Intl.NumberFormat();
formatter.format(number);
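A quick usage sketch (the 'en-US' locale here is an assumption; with no argument the constructor uses the runtime's default locale):
var formatter = new Intl.NumberFormat('en-US');
console.log(formatter.format(100000)); // "100,000"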
Regex:
function addThousandsSeparator(n) {
  // Insert a comma before each group of three digits; toString() lets this accept numbers too
  return n.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}
Numeral.js
numeral(number).format('0,0')
Number.prototype.toLocaleString("en-US") should insert commas the way you want it to:
Number("100000").toLocaleString("en-US")
// "100,000"
I have a column named params in a table named reports which contains JSON.
I need to find which rows contain the text 'authVar' anywhere in the JSON array. I don't know the path or level in which the text could appear.
I just want to search through the JSON with a standard LIKE operator.
Something like:
SELECT * FROM reports
WHERE params LIKE '%authVar%'
I have searched and googled and read the Postgres docs. I don't understand the JSON data type very well, and figure I am missing something easy.
The JSON looks something like this.
[
{
"tileId":18811,
"Params":{
"data":[
{
"name":"Week Ending",
"color":"#27B5E1",
"report":"report1",
"locations":{
"c1":0,
"c2":0,
"r1":"authVar",
"r2":66
}
}
]
}
}
]
In Postgres 11 or earlier it is possible to recursively walk through an unknown json structure, but it would be rather complex and costly. I would propose the brute force method, which should work well:
select *
from reports
where params::text like '%authVar%';
-- or
-- where params::text like '%"authVar"%';
-- if you are looking for the exact value
The query is very fast but may return unexpected extra rows when the searched string is part of one of the keys, e.g. a key named "authVars" would also match.
In Postgres 12+, recursive searching in JSONB is quite comfortable with the new jsonpath feature.
Find a string value containing authVar:
select *
from reports
where jsonb_path_exists(params, '$.** ? (@.type() == "string" && @ like_regex "authVar")')
The jsonpath:
$.** find any value at any level (recursive processing)
? where
@.type() == "string" the value is a string
&& and
@ like_regex "authVar" the value contains 'authVar'
Or find the exact value:
select *
from reports
where jsonb_path_exists(params, '$.** ? (@ == "authVar")')
Read in the documentation:
The SQL/JSON Path Language
jsonpath Type
Suppose I have the following JSON and I want to remove the "data_type" entry from it.
{
"marketing_type":"FIT",
"controllable":"true",
"plannable":"true",
"sbm_qualified":"true",
"marginal_cost":"{:type=>\"float\", :label=>\"Marginal Cost to steer\",:unit=>\"$/MWh\", :default=>100} must be float.",
"data_type": "any_value",
"start_cost":"{:type=>\"float\", :label=>\"Start Cost\", :unit=>\"$\",:default=>0} must be float."
}
The expected output is the JSON above with the "data_type" entry removed.
Instead of using regex and string manipulations, and if you're running at least MySQL 5.7, you can use one of the built-in JSON functions, json_remove:
update table_name set column_name = json_remove(column_name, "$.data_type")
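To see what json_remove does on its own, here is a standalone check (the literal is a trimmed-down version of the document above, just for illustration):
select json_remove('{"marketing_type": "FIT", "data_type": "any_value"}', '$.data_type');
-- {"marketing_type": "FIT"}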