So I have a .CSV file that contains dataset information, the data seems to be described in JSON. I want to read it with MatLab. One line example(7000 total) of the data:
imagename.jpg,"[[{""name"":""nose"",""position"":[2911.68,1537.92]},{""name"":""left eye"",""position"":[3101.76,544.32]},{""name"":""right eye"",""position"":[2488.32,544.32]},{""name"":""left ear"",""position"":null},{""name"":""right ear"",""position"":null},{""name"":""left shoulder"",""position"":null},{""name"":""right shoulder"",""position"":[190.08,1270.08]},{""name"":""left elbow"",""position"":null},{""name"":""right elbow"",""position"":[181.44,3231.36]},{""name"":""left wrist"",""position"":[2592,3093.12]},{""name"":""right wrist"",""position"":[2246.4,3965.76]},{""name"":""left hip"",""position"":[3006.72,3360.96]},{""name"":""right hip"",""position"":[155.52,3412.8]},{""name"":""left knee"",""position"":null},{""name"":""right knee"",""position"":null},{""name"":""left ankle"",""position"":[2350.08,4786.56]},{""name"":""right ankle"",""position"":[1460.16,5019.84]}]]","[[{""segment"":[[0,17.28],[933.12,5175.36],[0,5166.72],[0,2306.88]]}]]",https://imageurl.jpg,
If I use the Import functionlity/tool, I am able separate the data in four colums using the , as delimiter:
Image File Name,Key Points,Segmentation,Image URL,
imagename.jpg,
"[[{""name"":""nose"",""position"":[2911.68,1537.92]},{""name"":""left eye"",""position"":[3101.76,544.32]},{""name"":""right eye"",""position"":[2488.32,544.32]},{""name"":""left ear"",""position"":null},{""name"":""right ear"",""position"":null},{""name"":""left shoulder"",""position"":null},{""name"":""right shoulder"",""position"":[190.08,1270.08]},{""name"":""left elbow"",""position"":null},{""name"":""right elbow"",""position"":[181.44,3231.36]},{""name"":""left wrist"",""position"":[2592,3093.12]},{""name"":""right wrist"",""position"":[2246.4,3965.76]},{""name"":""left hip"",""position"":[3006.72,3360.96]},{""name"":""right hip"",""position"":[155.52,3412.8]},{""name"":""left knee"",""position"":null},{""name"":""right knee"",""position"":null},{""name"":""left ankle"",""position"":[2350.08,4786.56]},{""name"":""right ankle"",""position"":[1460.16,5019.84]}]]",
"[[{""segment"":[[0,17.28],[933.12,5175.36],[0,5166.72],[0,2306.88]]}]]",
https://imageurl.jpg,
But I have truble trying to use the tool to do further decomposition of the data. Of corse the ideal would be to separate the data in a code.
I hope someone can orientate me on how to or the tools I need to use. I have seen other questions, but they don't seem to fit my particular case.
Thank you very much!!
You can read a JSON file and store it in a MATLAB structure using the following command structure1 = matlab.internal.webservices.fromJSON(json_string)
You can create a JSON string from a MATLAB structure using the following command json_string= matlab.internal.webservices.toJSON(structure1)
JSONlab is what you want. It has a 'loadjson' function which inputs a char array of JSON data and returns a struct with all the data
I have category information in the training input data, I am wondering what's the best way to normalize it.
The category information is like "city", "gender" and etc.
I'd like to use Keras to handle the process.
Scikitlearn has a preprocessing library with functions to normalize or scale your data.
This video gives an example for how to preprocess data that will be used for training a model with Keras. The preprocessing here is done with the library mentioned above.
As shown in the video, with the use of Scikitlearn's MinMaxScaler class, you can specify a range that you want your data to be transformed into, and then fit your data to that range using the MinMaxScaler.fit_transform() function.
I have a problem to train for FCN with caffe.
I prepared my images(original image data, segmented image data).
eg).jpg.
Then, I want to convert my data to lmdb using convert_imageset.exe, but its format is image(array)_label(int). But my data is image(array)_label(array).
How to convert own images for FCN?
Edit convert_imageset.exe sources to handle your desired format - it's in caffe/tools/convert_imageset.cpp
I am interested in data mining and I am writing my thesis about it. For my thesis I want to use yelp's data challenge's data set, however i can not open it since it is in json format and almost 2 gb. In its website its been said that the dataset can be opened in phyton using mrjob, but I am also not very good with programming. I searched online and looked some of the codes yelp provided in github however I couldn't seem to find an article or something which explains how to open the dataset, clearly.
Can you please tell me step by step how to open this file and maybe how to convert it to csv?
https://www.yelp.com.tr/dataset_challenge
https://github.com/Yelp/dataset-examples
data is in .tar format when u extract it again it has another file,rename it to .tar and then extract it.you will get all the json files
yes you can use pandas. Take a look:
import pandas as pd
# read the entire file into a python array
with open('yelp_academic_dataset_review.json', 'rb') as f:
data = f.readlines()
# remove the trailing "\n" from each line
data = map(lambda x: x.rstrip(), data)
data_json_str = "[" + ','.join(data) + "]"
# now, load it into pandas
data_df = pd.read_json(data_json_str)
Now 'data_df' contains the yelp data ;)
Case, you want convert it directly to csv, you can use this script
https://github.com/Yelp/dataset-examples/blob/master/json_to_csv_converter.py
I hope it can help you
To process huge json files, use a streaming parser.
Many of these files aren't a single json, but a stream of jsons (known as "jsons format"). Then a regular json parser will consider everything but the first entry to be junk.
With a streaming parser, you can start reading the file, process parts, and wrote them to the desired output; then continue writing.
There is no single json-to-csv conversion.
Thus, you will not find a general conversion utility, you have to customize the conversion for your needs.
The reason is that a JSON is a tree but a CSV is not. There exists no ultimative and efficient conversion from trees to table rows. I'd stick with JSON unless you are always extracting only the same x attributes from the tree.
Start coding, to become a better programmer. To succeed with such amounts of data, you need to become a better programmer.