I read the doc of jsx which removed the post_decode option because it prevented the evolution of the jsx.
So what's the option now if I want what post_decode can do. For example, I can have a function to convert binary value into string value with this option.
F = fun(E) when is_binary(E) -> binary_to_list(E) end,
jsx:decode(BinaryJsonString, [{post_decode,F}]).
How do I do that now?
Related
i have extracted the fix message as below from Unix server and now need to convert this message into JSON. how can we do this?
8=FIXT.1.1|9=449|11=ABCD1|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111
Note: I tried to make this a comment on the answer provided by #selbie, but the text was too long for a comment, so I am making it an answer.
#selbie's answer will work most of the time, but there are two edge cases in which it could fail.
First, in a tag=value field where the value is of type STRING, it is legal for value to contain the = character. To correctly cope with this possibility, the Java statement:
pair = item.split("=");
should be changed to:
pair = item.split("=", 2);
The second edge case is when there are a pair of fields, the first of which is of type LENGTH and the second is of type DATA. In this case, the value of the LENGTH fields specifies the length of the DATA field (without the delimiter), and it is legal for the value of the DATA field to contain the delimiter character (ASCII character 1, but denoted as | in both the question and Selbie's answer). Selbie's code cannot be modified in a trivial manner to deal with this edge case. Instead, you will need a more complex algorithm that consults a FIX data dictionary to determine the type of each field.
Since you didn't tag your question for any particular programming language, I'll give you a few sample solutions:
In javascript:
let s = "8=FIXT.1.1|9=449|11=ABCD1|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111"
let obj = {};
items = s.split("|")
items.forEach(item=>{
let pair = item.split("=");
obj[pair[0]] = pair[1];
});
let jsonString = JSON.stringify(obj);
Python:
import json
s = "8=FIXT.1.1|9=449|11=ABCD1|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111"
obj = {}
for item in s.split("|"):
pair = item.split("=")
obj[pair[0]] = pair[1]
jsonString = json.dumps(obj)
Porting the above solutions to other languages is an exercise for yourself. There's comments below about semantic ordering and handling cases where the the = or | chars are part of the content. That's on you to explore if you need to support those scenarios.
I want to convert a string object from json file into integer using pyspark.
df1.select(df1["`result.price`"]).dtypes
Out[15]: [('result.price', 'string')]
df1=df1.withColumn(df1.select(df1["`result.price`"]),F.col(df1.select(df1["`result.price`"])).cast(T.IntegerType()))
'DataFrame' object has no attribute '_get_object_id'
If you want to modify inline:
Since you are trying to modify the data type of nested struct field, I think you need to apply the new StructType.
Take a look at this https://stackoverflow.com/a/63270808/2956135
If you are okay with extracting to a different column:
df1 = df1.withColumn('price', F.col('result.price').cast(T.IntegerType()))
TL;DR
Why your line gives an error?
There is a few mistakes in this syntax.
df1 = df1.withColumn(df1.select(df1["`result.price`"]),F.col(df1.select(df1["`result.price`"])).cast(T.IntegerType()))
First, 1st argument of withColumn has to be string of a column name that you want to save as.
Second, F.col's argument has to be string of a column name or reference to the column.
So, this syntax should not throw an error, however, the casted value is saved to the new column.
df1 = df1.withColumn('result.price', F.col('result.price').cast(T.IntegerType()))
When practicing with Octave I created a variable with the name my_name = ["Andrew"] and upon asking Octave to interpret whether it was a string it outputted a '0'. Again when using the typeinfo(my_name) I got ans = string. Why am I getting this sort of output?
octave:47> my_name = ["Andrew"]
my_name = Andrew
octave:48> isstring(my_name)
ans = 0
octave:49> typeinfo(my_name)
ans = string
According to the documentation (emphasis mine):
isstring (s)
Return true if s is a string array.
A string array is a data type that stores strings (row vectors of characters) at each element in the array. It is distinct from character arrays which are N-dimensional arrays where each element is a single 1x1 character. It is also distinct from cell arrays of strings which store strings at each element, but use cell indexing ‘{}’ to access elements rather than string arrays which use ordinary array indexing ‘()’.
Programming Note: Octave does not yet implement string arrays so this function will always return false.
That is, isstring will always return false (or 0), no matter what the input is.
You should use ischar to determine if the input is a character array (==string).
I am trying to read some data into julia into a data frame to work with it. A minimal example of the .csv file could look like this:
A; B; C; D
ab; 1,23; 4; 9,2
ab; 3,4; 7; 1,1
ba; 6; 2,3; 8,6
I load the following to packages and read the data:
using DataFrames
using CSV
d = CSV.read( "test.csv", delim=";")
Julia recognizes the following types:
eltypes(d)
CategoricalArrays.CategoricalString{UInt32}
String
String
String
How could I now turn whole columns to floats with the comma replaced by a dot? My first idea was to use:
float(d[1,2])
But I did not find an option to tell julia to replace the comma with a dot.
My next idea was to first replace the comma and then convert it:
float(replace(d[1,2], ",", "."))
That works fine on a single cell but not on a whole column:
float(replace(d[:,2], ",", "."))
MethodError: no method matching
replace(::WeakRefStrings.WeakRefStringArray{WeakRefString{UInt8},1,Union{}},
::String, ::String)
I also tried:
d = CSV.read( "test.csv", delim=";", decimal=",")
which also just gives an error ...
Any ideas how to handle this problem and how to efficiently read the data into julia?
Thanks a lot!
Best regards.
One straightforward way is to read the file to string, replace the comma decimal separators by dots and then create the DataFrame from it:
s = replace(readstring("test.csv"), ",", ".")
CSV.read(IOBuffer(s); delim=';', types=[String, Float64, Float64, Float64])
Note that you can use the types keyword to specifiy the column types (it will then implicitly parse the string entries).
EDIT: According to this github issue the CSV.jl's read method supports a decimal keyword (from version v0.2.0 on) which allows you to do
CSV.read("test.csv"; delim=';', decimal=',', types=[String, Float64, Float64, Float64])
EDIT: Removed hint to alternatively use readtable from DataFrames.jl because it seems to be deprecated in favor of CSV.read.
I have a lua script, which simplified is like this:
local item = {};
local id = redis.call("INCR", "counter");
item["id"] = id;
item["data"] = KEYS[1]
redis.call("SET", "item:" .. id, cjson.encode(item));
return cjson.encode(item);
KEYS[1] is a stringified json object:
JSON.stringify({name : 'some name'});
What happens is that because I'm using cjson.encode to add the item to the set, it seems to be getting stringified twice, so the result is:
{"id":20,"data":"{\"name\":\"some name\"}"}
Is there a better way to be handling this?
First, regardless your question, you're using KEYS the wrong way and your script isn't written according to the guidelines. You should not generate key names in your script (i.e. call SET with "item:" .. id as a keyname) but rather use the KEYS input array to declare any keys involved a priori.
Secondly, instead of passing the stringified string with KEYS, use the ARGV input array.
Thirdly, you can do item["data"] = json.decode(ARGV[1]) to avoid the double encoding.
Lastly, perhaps you should learn about Redis' Hash data type - it may be more suitable to your needs.