I'm using Flink to process the data coming from some data source (such as Kafka, Pravega etc).
In my case, the data source is Pravega, which provides a Flink connector.
My data source is sending me some JSON data as below:
{"key": "value"}
{"key": "value2"}
{"key": "value3"}
...
...
Here is my piece of code:
PravegaDeserializationSchema<ObjectNode> adapter = new PravegaDeserializationSchema<>(ObjectNode.class, new JavaSerializer<>());
FlinkPravegaReader<ObjectNode> source = FlinkPravegaReader.<ObjectNode>builder()
    .withPravegaConfig(pravegaConfig)
    .forStream(stream)
    .withDeserializationSchema(adapter)
    .build();
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<ObjectNode> dataStream = env.addSource(source).name("Pravega Stream");
dataStream.map(new MapFunction<ObjectNode, String>() {
        @Override
        public String map(ObjectNode node) throws Exception {
            return node.toString();
        }
    })
    .keyBy("word") // ERROR
    .timeWindow(Time.seconds(10))
    .sum("count");
As you see, I used the FlinkPravegaReader and a proper deserializer to get the JSON stream coming from Pravega.
Then I try to transform the JSON data into a String, KeyBy them and count them.
However, I get an error:
The program finished with the following exception:
Field expression must be equal to '*' or '_' for non-composite types.
org.apache.flink.api.common.operators.Keys$ExpressionKeys.<init>(Keys.java:342)
org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:340)
myflink.StreamingJob.main(StreamingJob.java:114)
It seems that KeyBy threw this exception.
Well, I'm not a Flink expert, so I don't know why. I've read the source code of the official WordCount example. In that example, there is a custom splitter that splits the String data into words.
So I'm wondering whether I need some kind of splitter in this case too. If so, what kind of splitter should I use? Can you show me an example? If not, why did I get this error, and how do I solve it?
I guess you have read the document about how to specify keys
Specify keys
The example code uses keyBy("word") because word is a field of the POJO type WC.
// some ordinary POJO (Plain old Java Object)
public class WC {
public String word;
public int count;
}
DataStream<WC> words = // [...]
DataStream<WC> wordCounts = words.keyBy("word").window(/*window specification*/);
In your case, you put a map operator before keyBy, and the output of that map operator is a String. A String is not a composite type and has no word field, which is why keyBy("word") fails. If you actually want to group the stream by the string itself, write .keyBy(String::toString)
Or you can implement a custom KeySelector to generate your own key.
Customized Key Selector
A third-party NuGet package throws an exception, and its exception message contains an error text followed by a JSON object:
Request failed, Message: {"Message":"Some error message"}
How can I extract the JSON from string and get the Message property?
I know that I could use a Regex to clean up the string before passing it to the deserializer, or even trim the text up to the first {.
Is there a cleaner way to do it using Json.NET?
No.
Json.Net is built to parse JSON. If you have extra text in the string that is not JSON, the parser will not be able to make sense of it. Your best bet is to strip off the text before the first brace (and after the last brace), like you suggested in your question. You can make a helper method to do this easily:
public static string ExtractJson(string text)
{
    int i = text.IndexOf('{');
    int j = text.LastIndexOf('}');
    return i > -1 && j > i ? text.Substring(i, j - i + 1) : null;
}
Once you've extracted the JSON, you can use Json.Net like you normally would.
Fiddle: https://dotnetfiddle.net/WoflVv
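The strip-then-parse idea is not specific to Json.NET. As a rough illustration only, here is the same approach sketched in JavaScript rather than C#. The name extractJson is a hypothetical helper that mirrors ExtractJson, and the sample message is the one from the question.

```javascript
// Hypothetical helper mirroring ExtractJson: keep only the text between
// the first '{' and the last '}' (inclusive), or return null if absent.
function extractJson(text) {
  const i = text.indexOf("{");
  const j = text.lastIndexOf("}");
  return i > -1 && j > i ? text.substring(i, j + 1) : null;
}

// Sample exception message taken from the question.
const exceptionMessage = 'Request failed, Message: {"Message":"Some error message"}';

// Strip the non-JSON prefix first, then hand the remainder to the parser.
const extracted = extractJson(exceptionMessage);
const message = extracted === null ? null : JSON.parse(extracted).Message;

console.log(message);
```

Like the C# version, this assumes the message contains a single JSON object; text with several separate objects or stray braces would need more careful handling.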
So I have some json that looks like this, which I got after taking it out of some other json by doing response.body.to_json:
"{\n \"access_token\": \"<some_access_token>\",\n \"token_type\": \"Bearer\",\n \"expires_in\": 3600,\n \"id_token\": \"<some_token>\"\n}\n"
I want to pull out the access_token, so I do
to_return = {token: responseJson[:access_token]}
but this gives me a
TypeError: no implicit conversion of Symbol into Integer
Why? How do I get my access token out? Why are there random backslashes everywhere?
to_json doesn't parse JSON. It does the complete opposite: it turns a Ruby object into a string containing the JSON representation of that object.
It's not clear from your question what response.body is. It could be a string, or depending on your http library it might have already been parsed for you.
If the latter, then
response.body["access_token"]
will be your token. If the former, try
JSON.parse(response.body)["access_token"]
Use a double-quoted string key when accessing access_token, like below:
to_return = {token: responseJson["access_token"]}
The backslashes are escaped quote delimiters; make sure you parse the JSON first.
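The backslashes appear whenever a string that already holds JSON is serialized a second time, and this is not Ruby-specific. As a rough sketch in JavaScript (not the Ruby code from the question), JSON.stringify plays the role of to_json here and JSON.parse the role of Ruby's JSON.parse. The body value and the token abc123 are made-up stand-ins for the real response.

```javascript
// Made-up stand-in for a response body that is already a JSON string.
const body = '{"access_token": "abc123", "token_type": "Bearer"}';

// Serializing a string again wraps it in quotes and escapes the inner
// quotes, which is where the backslashes come from.
const doubleEncoded = JSON.stringify(body);
console.log(doubleEncoded);

// Parsing (not serializing) turns the string into an object whose
// fields can be looked up by key.
const parsed = JSON.parse(body);
console.log(parsed.access_token);
```

The double-encoded value is still just a string, so looking up a key on it cannot return the token; parsing has to come first.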
string json="[{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/bdc2cd8e1da3451c84e332d1aa74f605\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"IT mobile computing,Comm Skills\",\"Name\":\"JAVA\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"bdc2cd8e1da3451c84e332d1aa74f605\",\"CreatedON\":\"2015_09_08-11:19:50\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\" },{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/3aae020c256f4dc1ab705af67bede2c7\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"IT mobile computing,Comm Skills\",\"Name\":\"Spring\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"3aae020c256f4dc1ab705af67bede2c7\",\"CreatedON\":\"2015_09_04-16:58:05\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\"},{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/c139b33a22a94a25bf624b94450aee3e\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"Social Skills\",\"Name\":\"SQL\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"c139b33a22a94a25bf624b94450aee3e\",\"CreatedON\":\"2015_09_04-16:54:44\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\"}]";
This is my JSON string, and I want to extract only the "Name" fields, comma-separated, into another variable.
For example:
string res={"JAVA","SPRING","SQL"}
What language are you trying to do this in? My first guess is JavaScript, but I wasn't sure.
If it's JavaScript, then you can get started by turning the JSON string into a JavaScript object like this:
var jsonElements = JSON.parse(json);
From there it's just a matter of looping over the array and collecting the Name property from each object.
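A minimal sketch of that loop, assuming a shortened stand-in for the JSON string in the question (only the Name and Type fields are kept here). The variable names names and res are illustrative.

```javascript
// Shortened stand-in for the JSON string from the question.
const json = '[{"Name":"JAVA","Type":"Folder"},{"Name":"Spring","Type":"Folder"},{"Name":"SQL","Type":"Folder"}]';

// Parse the string into an array of plain objects.
const jsonElements = JSON.parse(json);

// Collect the Name property of each element.
const names = jsonElements.map((element) => element.Name);

// Join the names into a single comma-separated string.
const res = names.join(",");

console.log(res); // prints "JAVA,Spring,SQL"
```

Property names are case-sensitive, so the lookup uses Name exactly as it is spelled in the JSON.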
Now we'll do this from C# instead
Ok, since you are doing the parsing from C# you would want this code instead:
string json = "[{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/bdc2cd8e1da3451c84e332d1aa74f605\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"IT mobile computing,Comm Skills\",\"Name\":\"JAVA\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"bdc2cd8e1da3451c84e332d1aa74f605\",\"CreatedON\":\"2015_09_08-11:19:50\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\" },{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/3aae020c256f4dc1ab705af67bede2c7\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"IT mobile computing,Comm Skills\",\"Name\":\"Spring\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"3aae020c256f4dc1ab705af67bede2c7\",\"CreatedON\":\"2015_09_04-16:58:05\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\"},{\"ParentId\":\"a9764da3147845c184bd272cef6a5937\",\"Path\":\"/LMS/Cabinet/c139b33a22a94a25bf624b94450aee3e\",\"CreatedBY\":\"admin\",\"IsActive\":\"Y\",\"CabinetName\":\"LMS\",\"FolderTag\":\"Social Skills\",\"Name\":\"SQL\",\"UpdatedBY\":\"\",\"Type\":\"Folder\",\"IsDelete\":\"N\",\"UpdatedON\":\"\",\"Id\":\"c139b33a22a94a25bf624b94450aee3e\",\"CreatedON\":\"2015_09_04-16:54:44\",\"TemplateId\":\"dd42c8a71a954c1d948ef35492ee1242\"}]";
dynamic jsonObj = (JArray)JsonConvert.DeserializeObject(json);
List<string> names = new List<string>();
foreach (JObject item in jsonObj)
{
    names.Add(item["Name"].ToString());
}
Deserialize the string and cast it to a JArray. Then, for each element in the array, read its Name property. I tested this code in Visual Studio and it appears to work.
I am trying to parse some JSON objects that consist only of (string, string) pairs, in order to emulate Resjson behaviour. The file I am parsing contains this:
{
"greeting":"Hello world",
"_greeting.comment":"Hello comment.",
"_greeting.source":"Original Hello",
}
Please note the last comma is incorrect. I also used http://jsonlint.com/ to test the JSON syntax, and it tells me it is incorrect, as I expected. My (slightly modified) code is:
string path = @"d:\resjson\example.resjson";
string jsonText = File.ReadAllText(path);
IDictionary<string, string> dict;
try
{
    dict = JsonConvert.DeserializeObject<IDictionary<string, string>>(jsonText);
}
catch(Exception ex)
{
    // code never reaches here
}
My above code returns the IDictionary with the 3 keys as if the formatting was correct. If I serialize back, the string obtained is without the last comma.
My questions are :
Is Newtonsoft.Json so permissive that it tolerates slight errors like this?
If so, can I configure it to be stricter?
Is there a way to check whether a string is valid JSON using Newtonsoft.Json, with and/or without this permissiveness?