I have the following code, where I write a list of objects to a CSV file with the attributes and columns defined by a bean class. I want to convert the writer into an input stream so I can read the values back and perform some computations on them. I also want to store the resulting file in a datastore like Amazon S3.
How do I convert the writer into an InputStream? I don't see an API for this. Can I read the file somehow, like CSVReader reader = new CSVReader(csvWriter)?
public <T> CSVWriter convertModelToObject(List<T> attributes, final Class<T> classType) throws IOException {
    CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"), com.opencsv.CSVParser.DEFAULT_SEPARATOR,
            com.opencsv.CSVParser.DEFAULT_QUOTE_CHARACTER);
    BeanToCsv<T> bean = new BeanToCsv<>();
    HeaderColumnNameMappingStrategy<T> mappingStrategy = new HeaderColumnNameMappingStrategy<>();
    mappingStrategy.setType(classType);
    bean.write(mappingStrategy, writer, attributes);
    return writer;
}
Consider replacing the FileWriter you are using with a PipedWriter, creating it with a PipedReader that you then use when creating the CSVReader. You can find an example of PipedReader/PipedWriter usage here.
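A minimal sketch of that idea (the row values and method name are made up for illustration; PipedReader/PipedWriter come from java.io, CSVReader/CSVWriter from com.opencsv, and this assumes the pre-5.x opencsv used in the question). The writing side has to run on its own thread, otherwise the pipe's fixed-size buffer can fill up and block:
public static void pipeCsvExample() throws IOException {
    PipedReader pipedReader = new PipedReader();
    CSVWriter csvWriter = new CSVWriter(new PipedWriter(pipedReader));

    // Produce the CSV on a separate thread so the reader below can drain the pipe.
    new Thread(() -> {
        try (CSVWriter w = csvWriter) {
            w.writeNext(new String[] {"name", "value"}); // illustrative rows
            w.writeNext(new String[] {"foo", "42"});
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }).start();

    // Consume the rows from the reading end of the pipe.
    try (CSVReader csvReader = new CSVReader(pipedReader)) {
        String[] row;
        while ((row = csvReader.readNext()) != null) {
            System.out.println(Arrays.toString(row));
        }
    }
}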
Yes, you can.
The solution is to use an InputStreamReader to read the file, wrap it in a BufferedReader, and read it line by line (or however you want).
You can refer to this for more methods: https://www.geeksforgeeks.org/different-ways-reading-text-file-java/
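A rough sketch of that approach, reading back the file the writer produced (make sure the CSVWriter was flushed/closed first; the UTF-8 charset is an assumption):
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream("yourfile.csv"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // each line is one CSV record; parse it or run your computations here
        System.out.println(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}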
I'm using Flink to process some JSON-format data coming from some Data Source.
For now, my process is quite simple: extract each element from the JSON-format data and print them to the log file.
Here is my piece of code:
// create a proper deserializer to deserialize the JSON-format data into an ObjectNode
PravegaDeserializationSchema<ObjectNode> adapter = new PravegaDeserializationSchema<>(ObjectNode.class, new JavaSerializer<>());
// create connector to receive data from Pravega
FlinkPravegaReader<ObjectNode> source = FlinkPravegaReader.<ObjectNode>builder()
.withPravegaConfig(pravegaConfig)
.forStream(stream)
.withDeserializationSchema(adapter)
.build();
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<ObjectNode> dataStream = env.addSource(source).name("Pravega Stream");
dataStream.???.print();
Say the data coming from Pravega looks like this: {"name":"titi", "age":18}
As I said, for now I simply need to extract name and age and print them.
So how could I do this?
As I understand it, I need to add some custom code at ???. I might need to create a custom POJO class which contains ObjectNode, but I don't know how. I've read the official Flink docs and also tried to google how to create a custom POJO for Flink, but I still can't figure it out clearly.
Could you please show me an example?
Why don't you simply use something more meaningful instead of JavaSerializer? Perhaps something from here.
You could then create a POJO with the fields you want to use and simply deserialize the JSON data into your POJO instead of ObjectNode. For example:
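A minimal sketch of such a POJO (the field names are taken from the sample record; Flink treats it as a POJO as long as it has a public no-argument constructor and public fields or getters/setters):
// Hypothetical POJO matching {"name":"titi", "age":18}
public class MyPojo {
    public String name;
    public int age;

    public MyPojo() {} // no-arg constructor required by Flink/Jackson

    @Override
    public String toString() {
        return name + " is " + age; // what print() will show
    }
}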
Also, if there is some specific reason that you need to have ObjectNode from the deserialization, then you can simply do something like this:
// I assume you have created the class named MyPojo
dataStream.map(new MapFunction<ObjectNode, MyPojo>() {

    ObjectMapper mapper = new ObjectMapper();

    @Override
    public MyPojo map(final ObjectNode value) throws Exception {
        // note: ObjectNode#asText() returns an empty string for object nodes, so serialize the node with toString()
        return mapper.readValue(value.toString(), MyPojo.class);
    }
});
I'm getting an error when unmarshalling files that only contain a single JSON object: "IllegalStateException: The Json input stream must start with an array of Json objects"
I can't find any workaround and I don't understand why it has to be so.
@Bean
public ItemReader<JsonHar> reader(@Value("file:${json.resources.path}/*.json") Resource[] resources) {
log.info("Processing JSON resources: {}", Arrays.toString(resources));
JsonItemReader<JsonHar> delegate = new JsonItemReaderBuilder<JsonHar>()
.jsonObjectReader(new JacksonJsonObjectReader<>(JsonHar.class))
.resource(resources[0]) //FIXME had to force this, but fails anyway because the file is "{...}" and not "[...]"
.name("jsonItemReader")
.build();
MultiResourceItemReader<JsonHar> reader = new MultiResourceItemReader<>();
reader.setDelegate(delegate);
reader.setResources(resources);
return reader;
}
I need a way to unmarshal single-object files. What's the point in forcing arrays (which I won't have in my use case)?
I don't understand why it has to be so.
The JsonItemReader is designed to read an array of objects because batch processing is usually about handling data sources with a lot of items, not a single item.
I can't find any workaround
JsonObjectReader is what you are looking for: you can implement it to read a single JSON object and use it with the JsonItemReader (either at construction time or via the setter). This is not a workaround but a strategy interface designed for specific use cases like yours.
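A minimal sketch of such an implementation, assuming Jackson is on the classpath (the class name is made up here; only the JsonObjectReader interface comes from Spring Batch):
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.batch.item.json.JsonObjectReader;
import org.springframework.core.io.Resource;
import java.io.InputStream;

static class SingleObjectJsonObjectReader<T> implements JsonObjectReader<T> {

    private final ObjectMapper mapper = new ObjectMapper();
    private final Class<T> itemType;
    private InputStream inputStream;
    private boolean consumed;

    SingleObjectJsonObjectReader(Class<T> itemType) {
        this.itemType = itemType;
    }

    @Override
    public void open(Resource resource) throws Exception {
        this.inputStream = resource.getInputStream();
        this.consumed = false;
    }

    @Override
    public T read() throws Exception {
        if (consumed) {
            return null; // a single-object file has nothing left after the first read
        }
        consumed = true;
        return mapper.readValue(inputStream, itemType);
    }

    @Override
    public void close() throws Exception {
        if (inputStream != null) {
            inputStream.close();
        }
    }
}
You would then pass new SingleObjectJsonObjectReader<>(JsonHar.class) to .jsonObjectReader(...) instead of the JacksonJsonObjectReader.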
Definitely not ideal, @thomas-escolan. As @mahmoud-ben-hassine pointed out, the ideal would be to code a custom reader.
In case some new SO users stumble on this question, I leave here a code example of how to do it.
Though this may not be ideal, this is how I handled the situation:
@Bean
public ItemReader<JsonHar> reader(@Value("file:${json.resources.path}/*.json") Resource[] resources) {
log.info("Processing JSON resources: {}", Arrays.toString(resources));
JsonItemReader<JsonHar> delegate = new JsonItemReaderBuilder<JsonHar>()
.jsonObjectReader(new JacksonJsonObjectReader<>(JsonHar.class))
.resource(resources[0]) //DEBUG had to force this because of NPE...
.name("jsonItemReader")
.build();
MultiResourceItemReader<JsonHar> reader = new MultiResourceItemReader<>();
reader.setDelegate(delegate);
reader.setResources(Arrays.stream(resources)
.map(WrappedResource::new) // forcing the bride to look good enough
.toArray(Resource[]::new));
return reader;
}
@RequiredArgsConstructor
static class WrappedResource implements Resource {
@Delegate(excludes = InputStreamSource.class)
private final Resource resource;
@Override
public InputStream getInputStream() throws IOException {
log.info("Wrapping resource: {}", resource.getFilename());
InputStream in = resource.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in, UTF_8));
String wrap = reader.lines().collect(Collectors.joining())
.replaceAll("[^\\x00-\\xFF]", ""); // strips characters outside the 0x00-0xFF range
return new ByteArrayInputStream(("[" + wrap + "]").getBytes(UTF_8));
}
}
I would like to store a MongoDB Document (org.bson.Document) as a Jackson JsonNode. There is an outdated answer to this problem here; inspired by it, I was able to successfully parse the Document with
ObjectMapper mapper = new ObjectMapper();
...
JsonNode jsonData = mapper.readTree(someBsonDocument.toJson());
In my understanding this will:
Convert the Document to a JSON string
Parse the string and create a JsonNode object
I noticed there is some support for MongoDB/BSON for the Jackson Project - jackson-datatype-mongo and BSON for Jackson, but I can not figure out how to use them to do the conversion more efficiently.
I was able to figure out a solution using bson4jackson:
public static InputStream documentToInputStream(final Document document) {
BasicOutputBuffer outputBuffer = new BasicOutputBuffer();
BsonBinaryWriter writer = new BsonBinaryWriter(outputBuffer);
new DocumentCodec().encode(writer, document, EncoderContext.builder().isEncodingCollectibleDocument(true).build());
return new ByteArrayInputStream(outputBuffer.toByteArray());
}
public static JsonNode documentToJsonNode(final Document document) throws IOException {
ObjectMapper mapper = new ObjectMapper(new BsonFactory());
InputStream is = documentToInputStream(document);
return mapper.readTree(is);
}
I am not sure if this is the most efficient way, but I assume it is still a better solution than converting the BSON to a String and parsing that string. There is an open ticket in the MongoDB JIRA for adding conversion between Document, DBObject and BsonDocument and Jackson's JsonNode, which would simplify the whole process a lot.
Appreciate this isn't what the OP asked for, but it might be helpful to some. I've managed to do this in reverse using MongoJack. The key thing is to use the JacksonEncoder, which can turn any JSON-like object into a BSON object, then use BsonDocumentWriter to write it to a BsonDocument instance.
@Test
public void writeBsonDocument() throws IOException {
JsonNode jsonNode = new ObjectMapper().readTree("{\"wibble\": \"wobble\"}");
BsonDocument document = new BsonDocument();
BsonDocumentWriter writer = new BsonDocumentWriter(document);
JacksonEncoder transcoder =
new JacksonEncoder(JsonNode.class, null, new ObjectMapper(), UuidRepresentation.UNSPECIFIED);
var context = EncoderContext.builder().isEncodingCollectibleDocument(true).build();
transcoder.encode(writer, jsonNode, context);
Assertions.assertThat(document.toJson()).isEqualTo("{\"wibble\": \"wobble\"}");
}
Trying to parse a .csv file with jackson-dataformat-csv. The file contains a lot of columns that are not relevant for my program.
Tried to use @JsonIgnoreProperties(ignoreUnknown = true) on my data class,
and csvMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES), but neither works, and the application throws an exception:
com.fasterxml.jackson.databind.RuntimeJsonMappingException: Too many entries: expected at most 2 (value #2 (17 chars) "policy_issue_date")
at [Source: (com.fasterxml.jackson.dataformat.csv.impl.UTF8Reader); line: 1, column: 37]
at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:194)
at pl.polins.readers.oc.OcPolicyCsvReader.readNext(OcPolicyCsvReader.kt:25)
at pl.polins.readers.oc.OcPolicyCsvReaderTest.should read PolicyCsv from .csv file(OcPolicyCsvReaderTest.groovy:19)
Caused by: com.fasterxml.jackson.dataformat.csv.CsvMappingException: Too many entries: expected at most 2 (value #2 (17 chars) "policy_issue_date")
at [Source: (com.fasterxml.jackson.dataformat.csv.impl.UTF8Reader); line: 1, column: 37]
at com.fasterxml.jackson.dataformat.csv.CsvMappingException.from(CsvMappingException.java:23)
at com.fasterxml.jackson.dataformat.csv.CsvParser._reportCsvMappingError(CsvParser.java:1210)
at com.fasterxml.jackson.dataformat.csv.CsvParser._handleExtraColumn(CsvParser.java:965)
at com.fasterxml.jackson.dataformat.csv.CsvParser._handleNextEntry(CsvParser.java:826)
at com.fasterxml.jackson.dataformat.csv.CsvParser.nextToken(CsvParser.java:580)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:418)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1266)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:325)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:159)
at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277)
at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:192)
... 2 more
Is there any solution to ignore unwanted columns in csv?
Found solution:
csvMapper.enable(CsvParser.Feature.IGNORE_TRAILING_UNMAPPABLE)
This worked for me:
csvMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
Introduction
For the sake of comprehensibility, here is a simple (Java) example that:
reads a CSV file from an InputStream (jackson-dataformat-csv dependency)
maps its content to a list of objects (jackson-core dependency)
CSV file content
Let data.csv be a CSV file with the following content:
a;b;c
1;2;0.5
3;4;
Java data class with a missing attribute
The class MyModel represents a data class with the following attributes:
private Long a;
private Integer b;
Note that the attribute c is missing, hence the parser will have to ignore it.
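A minimal sketch of what MyModel could look like (the accessors are assumed here; Jackson only needs a default constructor plus getters/setters or public fields):
public class MyModel {

    private Long a;
    private Integer b;

    public Long getA() { return a; }
    public void setA(Long a) { this.a = a; }

    public Integer getB() { return b; }
    public void setB(Integer b) { this.b = b; }
}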
Read the CSV content and map into a list of objects
The CsvMapper can then be coupled with the Jackson object reader to read the records from a CSV file into a list of objects, all in one convenient method:
<U> List<U> mapRecordsToObjects(InputStream inputStream, Class<U> encodingType) {
CsvMapper csvMapper = new CsvMapper();
CsvSchema bootstrapSchema = CsvSchema.emptySchema() //
.withHeader() //
.withColumnSeparator(";");
ObjectReader reader = csvMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES) //
.readerFor(encodingType) //
.with(bootstrapSchema);
MappingIterator<U> iterator;
try {
iterator = reader.readValues(inputStream);
} catch (IOException e) {
throw new IllegalStateException("could not read CSV records from the input stream", e);
}
List<U> results = new ArrayList<>();
iterator.forEachRemaining(results::add);
return results;
}
Finally, let's call this method:
List<MyModel> result = mapRecordsToObjects(fileInputStream, MyModel.class);
In order to read the file, you will just need to initialise the InputStream.
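For instance, if the data sits in a file on disk (the file name is just an example):
InputStream fileInputStream = new FileInputStream("data.csv");
List<MyModel> result = mapRecordsToObjects(fileInputStream, MyModel.class);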
Deserialization feature
In the documentation, the class DeserializationFeature has the following description:
Enumeration that defines simple on/off features that affect the way
Java objects are deserialized from JSON
In this enumeration class, there are many features, each with a default state (some enabled by default, some disabled). The feature FAIL_ON_UNKNOWN_PROPERTIES is enabled by default and can be disabled as shown in the example. In its description, one can read:
Feature that determines whether encountering of unknown properties
(ones that do not map to a property, and there is no "any setter" or
handler that can handle it) should result in a failure (by throwing a
{@link JsonMappingException}) or not.
My JavaFX application handles large amounts of JSON data. What is the simplest way to visualize the JSON data in a table, which also must be editable?
The obvious method is to convert JSON to Java objects but for a number of reasons I would like to avoid that.
UPDATE, from a comment below I have tried this (feeding a ListView directly):
String json = "[{\"fields\":{\"VENDOR\":[\"xxx\"],\"TYPE\":[\"yyyyy\"]}, \"path\": \"C:\"}]";
@FXML
private ListView idListView;
JsonReader reader = Json.createReader(new StringReader(json));
public JsonArray myItems = reader.readArray();
reader.close();
public ObservableList<JsonObject> oList;
oList = FXCollections.observableArrayList((JsonObject[]) myItems.toArray());
idListView.setItems(oList);
Not working for me. What can I do differently?
I followed this advice: "You may convert the json data to map and follow Example 12-12 Adding Map Data to the Table. – Uluk Biy Jun 5 at 9:05"
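That suggestion boils down to something like the following sketch (the "path" key and column title come from the JSON in the question; the helper name and everything else are illustrative; TableView/TableColumn are from javafx.scene.control and MapValueFactory from javafx.scene.control.cell):
// Hypothetical helper: build a TableView from the JSON array parsed above.
private TableView<Map> buildTable(JsonArray items) {
    TableView<Map> table = new TableView<>();
    table.setEditable(true);

    TableColumn<Map, String> pathColumn = new TableColumn<>("path");
    pathColumn.setCellValueFactory(new MapValueFactory<>("path"));
    table.getColumns().add(pathColumn);

    // One Map per JSON object = one row in the table.
    ObservableList<Map> rows = FXCollections.observableArrayList();
    for (JsonValue value : items) {
        JsonObject obj = (JsonObject) value;
        Map<String, Object> row = new HashMap<>();
        row.put("path", obj.getString("path"));
        rows.add(row);
    }
    table.setItems(rows);
    return table;
}
To make the cells actually editable you would additionally install a cell factory on the column, e.g. TextFieldTableCell.forTableColumn().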