I'm importing some data using the formula below, but the numerical values appear as =1599 (for example) and are treated as text (i.e. they cannot be used in a formula). Does anyone know how to substitute the "=" with "" in the table? The numerical values are in column H.
={QUERY(IMPORTHTML("https://1234567.website.com","table",0), "where Col1 is not null",1)}
I tried wrapping in:
SUBSTITUTE( ... ,"=","")
ARRAYFORMULA(SUBSTITUTE( ... ,"=",""))
TO_PURE_NUMBER( ... )
Nothing works. Is there a way to apply one of these solutions only to the columns with numerical values?
Try iferror() and regexextract(): regexextract() pulls the numeric part out of each cell, value() converts it to a real number, and iferror() falls back to the original text wherever extraction fails, so the formula can safely be applied to the whole table at once:
=arrayformula(
  lambda(data,
    iferror(value(regexextract(data, "[-.\d%]+")), data)
  )(
    importhtml("https://1234567.website.com", "table", 0)
  )
)
In my Hive table "ticket_full" I have a JSON-type column named "service_id" that I would like to extract into 3 columns. It looks like this:
[{"position":"1","typeid":"ROUTNAME","value":"PWAW13197"},{"position":"2","typeid":"CDCNAME","value":null},{"position":"3","typeid":"SVCNAME","value":"Business"},{"position":"4","typeid":"USID","value":"FI021MLQE4"}]
[{"position":"1","typeid":"ROUTNAME","value":"KHLA30076"},{"position":"2","typeid":"CDCNAME","value":"eff-e-rjh-sw-cs2"},{"position":"3","typeid":"SVCNAME","value":"Managed LAN"},{"position":"4","typeid":"USID","value":"SA00BNGH0E"}]
[{"position":"1","typeid":"NUMLIAPTT","value":"0492212984"},{"position":"2","typeid":null,"value":null},{"position":"3","typeid":null,"value":null},{"position":"4","typeid":null,"value":null}]
I used the code below:
SELECT get_json_object(single_json_table.identifiant_produit, '$.position') AS position,
get_json_object(single_json_table.identifiant_produit, '$.typeid') AS typeid,
get_json_object(single_json_table.identifiant_produit, '$.value') AS value
FROM
(SELECT explode(split(regexp_replace(substr(serviceid, 2, length(serviceid)-2),
'"},\\{"', '"},,,,{"'), ',,,,') ) as identifiant_produit
FROM ticket_full) single_json_table
It works, but every time there is a NULL value it ignores what follows and skips to the next field. Does anyone know how to fix this, please?
It is because null has no double quotes, while you are replacing '"},\\{"' with '"},,,,{"'.
Try removing the double quote before } in the regex pattern (adjusting the replacement string accordingly); then it will work with both quoted values and nulls:
split(regexp_replace(substr(serviceid, 2, length(serviceid)-2),
'},\\{"', '},,,,{"'), ',,,,')
I have a table named templateType, and it has a column named Template_Text. The template contains many smart tags like <<tag>> that I need to convert to [tag] using MySQL, i.e. replace << with [ and >> with ].
Edit from OP's comments:
It is a template with a large text containing multiple smart tags. As an example: " I <<Fname>> <<Lname>>, <<UserId>> <<Designation>> of xyz organization, Proud to announce...."
Here I need to replace these << with [ and >> with ], so it will look like
" [Fname] [Lname], [UserId] ...."
Based on your comments, your MySQL version does not support the Regex_Replace() function, so a fully generic solution is not feasible.
Assuming that your string does not contain additional << and >> other than those following the <<%>> format, we can use the Replace() function.
I have also added a WHERE condition, so that we pick only those rows which match our given substring criteria.
Update templateType
SET Template_Text = REPLACE(REPLACE(Template_Text, '<<', '['), '>>', ']')
WHERE Template_Text LIKE '%<<%>>%'
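To preview the effect before running the update (a sketch against the same table and column):

SELECT Template_Text,
       REPLACE(REPLACE(Template_Text, '<<', '['), '>>', ']') AS converted
FROM templateType
WHERE Template_Text LIKE '%<<%>>%'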
In case the problem is further complex, you may get some ideas from this answer: https://stackoverflow.com/a/53286571/2469308
A couple of replace calls should work:
SELECT REPLACE(REPLACE(template_text, '<<', '['), '>>', ']')
FROM template_type
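For a quick sanity check against the example string from the question (no table needed):

SELECT REPLACE(REPLACE('I <<Fname>> <<Lname>>, <<UserId>>', '<<', '['), '>>', ']');
-- returns: I [Fname] [Lname], [UserId]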
I have a dataframe "d1" from a big MySQL table. I need find there an unused columns (which contains only NA or empty strings).
(see question Find columns with all missing values ).
This seems to work fine:
allmisscols <- apply(d1,2, function(x)all(is.na(x)));
colswithallmiss <-names(allmisscols[allmisscols>0]);
cat( colswithallmiss,sep="\n");
...
allmisscols <- apply(d1,2, function(x)all(x==''));
colswithallmiss <-names(allmisscols[allmisscols>0]);
cat( colswithallmiss,sep="\n");
...
although the second one also gives "NA" among the column names; I don't understand why.
But when I try to combine them:
allmisscols <- apply(d1,2, function(x)all(is.na(x)||x=='') );
colswithallmiss <-names(allmisscols[allmisscols>0]);
print("the columns with all values missing");
print(colswithallmiss);
I see a column in the result that actually contains a value in my table!
The same happens with the following:
library(stringr);
sapply(d1, function(x)all(any(is.na(x)||(str_trim(x)==""))))
So my questions are:
Why did I get such unexpected results?
How can I get the list of columns which contain only empty or NA values?
Try this:
allmisscols <- sapply(dt, function(x) all(is.na(x) | x == '' ))
Note: you've used OR as the double '||'; try making it the single '|'. Read this SO post: Boolean operators && and ||
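A minimal sketch of the difference, on a toy vector rather than your data:

x <- c(NA, "", "a")           # third element is a real value
is.na(x) | x == ""            # element-wise: TRUE TRUE FALSE
all(is.na(x) | x == "")       # FALSE -- the column is correctly not flagged
all(is.na(x) || x == "")      # TRUE on older R: || looks only at the first
                              # elements, so one leading NA or "" flags the
                              # whole column (R >= 4.3 raises an error when
                              # || gets a vector of length > 1)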
I have a lot of rows in a database where the column "info" is e.g. "Size, inch: 12x13<br>Material: paper<br>Amount: 100pcs". Now I need to find/select all rows that contain the string part "Size, inch:" and replace/fix those rows' columns to e.g. "Size: 12x13 inch<br>Material: paper<br>Amount: 100pcs".
How in the world do I write an SQL statement for this? Do I need some magic regexp for SQL? How do I replace and modify parts of a string in multiple rows?
EDIT: The numbers can be anything (e.g. 12x13, 44x55, 77x88x99, etc.), so there would have to be some kind of wildcard for the numbers, perhaps?
I need to change "Size, inch: 'anynumbershere' 'anythingafterthenumbers'" to "Size: 'anynumbershere' inch 'anythingafter'".
update mytable set info = REPLACE(info, 'Size, inch: 12x13', 'Size: 12x13 inch')
where info like '%Size, inch: 12x13%'
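This only covers the literal 12x13, though. If your MySQL is 8.0+ (an assumption; earlier versions have no regexp replace), the wildcard case from the edit can be handled with a capture group:

update mytable
set info = REGEXP_REPLACE(info, 'Size, inch: ([0-9x]+)', 'Size: $1 inch')
where info like '%Size, inch:%';
-- $1 is the captured number part, e.g. 12x13 or 77x88x99
-- (MariaDB spells the backreference \\1 instead of $1)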
Perhaps this will work:
update mytable
set info = replace(replace(info, 'Size, inch:', 'Size:'), '<br>Material', ' inch<br>Material')
where info like 'Size, inch:%<br>Material%';
Note: this assumes that 'Size, inch' and '<br>Material' only appear once in each string.
I want to import a lot of information from a CSV file into Elasticsearch.
My issue is that I don't know how to use an equivalent of substring to select information from a CSV column.
In my case I have a date field (YYYYMMDD) and I want to have (YYYY-MM-DD).
I use filter, mutate, and gsub like:
filter {
  mutate {
    gsub => ["date", "[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]", "[0123456789][0123456789][0123456789][0123456789]-[0123456789][0123456789]-[0123456789][0123456789]"]
  }
}
But my result is wrong. I can identify my string, but I don't know how to extract part of it.
My target is to have something like:
gsub => ["date", "[0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789][0123456789]", "%{date}(0..3)-%{date}(4..5)-%{date}(6..7)"]
%{date}(0..3) : select the first 4 characters of the CSV column date
You can use the ruby plugin to do the conversion. As you say, you will have a date field, so we can use it directly in Ruby:
filter {
  ruby {
    code => "
      # parse YYYYMMDD, then re-emit it with dashes
      date = Time.strptime(event['date'], '%Y%m%d')
      event['date_new'] = date.strftime('%Y-%m-%d')
      # (on Logstash >= 5, use event.get('date') / event.set('date_new', ...))
    "
  }
}
The date_new field is in the format you want.
First, you can use a regexp range to match a set of characters, so rather than [0123456789] you can write [0-9]. If you know there will be exactly 4 digits, you can write [0-9]{4}.
Second, you want to "capture" parts of your input string and reorder them in the output. For that, you need capture groups:
([0-9]{4})([0-9]{2})([0-9]{2})
where parens define the groups. Then you can reference those on the right side of your gsub:
\1-\2-\3
\1 is the first capture group, etc.
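Putting it together, the mutate filter would look something like this (a sketch; with Logstash's default string handling the backreferences can be written directly):

filter {
  mutate {
    # capture year, month, day and reassemble with dashes
    gsub => ["date", "([0-9]{4})([0-9]{2})([0-9]{2})", "\1-\2-\3"]
  }
}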
You might also consider getting these three fields when you do the grok{}, and then putting them together again later (perhaps with add_field).