I am using the univocity BeanProcessor for file parsing. It worked fine on my local box, but after deploying the same code to an environment with multiple hosts, the parser shows inconsistent behavior: it sometimes does not fail processing for invalid files, and sometimes fails processing for valid files.
I would like to know whether the BeanProcessor implementation is suitable for a multi-threaded, distributed environment.
Sample code:
private void validateFile(@Nonnull final File inputFile) throws NonRetriableException {
    try {
        final BeanProcessor<TargetingInputBean> rowProcessor = new BeanProcessor<TargetingInputBean>(
                TargetingInputBean.class) {
            @Override
            public void beanProcessed(@Nonnull final TargetingInputBean targetingInputBean,
                    @Nonnull final ParsingContext context) {
                final String customerId = targetingInputBean.getCustomerId();
                final String segmentId = targetingInputBean.getSegmentId();
                log.debug("Validating customerId {} segmentId {} for {} file", customerId, segmentId,
                        inputFile.getAbsolutePath());
                if (StringUtils.isBlank(customerId) || StringUtils.isBlank(segmentId)) {
                    throw new DataProcessingException("customerId or segmentId is blank");
                }
                try {
                    someValidation(customerId);
                } catch (IllegalArgumentException ex) {
                    throw new DataProcessingException(
                            String.format("customerId %s is not in required format. Exception"
                                    + " message %s", customerId, ex.getMessage()),
                            ex);
                }
            }
        };
        rowProcessor.setStrictHeaderValidationEnabled(true);
        final CsvParser parser = new CsvParser(getCSVParserSettings(rowProcessor));
        parser.parse(inputFile);
    } catch (TextParsingException ex) {
        throw new NonRetriableException(
                String.format("Exception=%s occurred while getting & parsing targeting file "
                        + "contents, error=%s", ex.getClass(), ex.getMessage()),
                ex);
    }
}

private CsvParserSettings getCSVParserSettings(@Nonnull final BeanProcessor<TargetingInputBean> rowProcessor) {
    final CsvParserSettings parserSettings = new CsvParserSettings();
    parserSettings.setProcessor(rowProcessor);
    parserSettings.setHeaderExtractionEnabled(true);
    parserSettings.getFormat().setDelimiter(AIRCubeTargetingFileConstants.FILE_SEPARATOR);
    return parserSettings;
}
TargetingInputBean:
public class TargetingInputBean {
    @Parsed(field = "CustomerId")
    private String customerId;

    @Parsed(field = "SegmentId")
    private String segmentId;

    // getters used by beanProcessed(...) above
    public String getCustomerId() { return customerId; }
    public String getSegmentId() { return segmentId; }
}
Are you using the latest version?
I just realized you are probably affected by a bug introduced in version 2.5.0 and fixed in version 2.5.6, if I'm not mistaken. This plagued me for a while, as it was an internal concurrency issue that was hard to track down. Basically, when you pass a File without an explicit encoding, the parser tries to find a UTF BOM marker in the input (effectively consuming the first character) to determine the encoding automatically. This happened only for InputStreams and Files.
Anyway, this has been fixed, so simply updating to the latest version should get rid of the problem for you (please let me know if you are not using version 2.5.something).
If you want to remain on the version you currently have, the error will be gone if you call
parser.parse(inputFile, Charset.defaultCharset());
This will prevent the parser from trying to discover whether there's a BOM marker in your file, therefore avoiding that pesky bug.
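In context, a minimal sketch of the change inside the validateFile method above; only the parse call differs (Charset here is java.nio.charset.Charset):
final CsvParser parser = new CsvParser(getCSVParserSettings(rowProcessor));
// An explicit Charset bypasses the automatic BOM detection,
// which is where the concurrency issue in 2.5.0-2.5.5 lives.
parser.parse(inputFile, Charset.defaultCharset());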
Hope this helps
I would like to catch this exception rather than simply returning a 500 to the end users, which is a poor experience, at least in my application.
The intention is to return the user to the form page with some feedback so they can try again.
Currently the user gets a 500 thrown back at them and the following is printed to the logs:
Caused by: org.apache.tomcat.util.http.fileupload.FileUploadBase$SizeLimitExceededException: the request was rejected because its size (157552) exceeds the configured maximum (1024)
Crediting @james-kleeh for this head start: I could only get this working on Grails 4.0.0.M2 by extending the StandardServletMultipartResolver implementation, which is what is used as the default. The maxFileSize limits then continue to be resolved from config (YAML).
public class MyMultipartResolver extends StandardServletMultipartResolver {

    static final String FILE_SIZE_EXCEEDED_ERROR = "fileSizeExceeded"

    public MultipartHttpServletRequest resolveMultipart(HttpServletRequest request) {
        try {
            return super.resolveMultipart(request)
        } catch (MaxUploadSizeExceededException e) {
            request.setAttribute(FILE_SIZE_EXCEEDED_ERROR, true)
            return new DefaultMultipartHttpServletRequest(request,
                    new LinkedMultiValueMap<String, MultipartFile>(),
                    new LinkedHashMap<String, String[]>(),
                    new LinkedHashMap<String, String>());
        }
    }
}
With the following in resources.groovy:
// catch exception when max file size is exceeded
multipartResolver(MyMultipartResolver)
You need to subsequently check for the FILE_SIZE_EXCEEDED_ERROR attribute in the controller and handle accordingly.
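For example, the controller-side check might look like this (a sketch; the controller, action, and view names are hypothetical):
class UploadController {

    def save() {
        // Attribute set by MyMultipartResolver when the upload exceeded the limit
        if (request.getAttribute(MyMultipartResolver.FILE_SIZE_EXCEEDED_ERROR)) {
            flash.message = "The file you uploaded is too large, please try a smaller one"
            redirect(action: "create") // back to the form page
            return
        }
        // ... normal upload handling
    }
}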
I'm using Newtonsoft's Json.NET parser for JSON parsing. In my deserialization, I have the following code so that errors when converting from String to Int will not force me to throw away the entire object:
var param2 = new JsonSerializerSettings
{
    Error = delegate(object sender, Newtonsoft.Json.Serialization.ErrorEventArgs args)
    {
        args.ErrorContext.Handled = true;
    }
};
bcontent = JsonConvert.DeserializeObject<BContent>(json, param2);
I do not have control of the input data, and parsing errors are very common, so I need to be versatile enough to handle them. Unfortunately, marking all errors as handled causes the deserialization not to terminate when it runs into a different kind of error in a constrained environment.
What I want to do is mark errors as handled when they have a message similar to:
Could not convert string to integer....
But not when they are something different, such as this error which causes the hang:
Unterminated string. Expected delimiter...
What I can do is something like this:
Error = delegate(object sender, Newtonsoft.Json.Serialization.ErrorEventArgs args)
{
    if (args.ErrorContext.Error.Message.Contains("convert string to integer"))
        args.ErrorContext.Handled = true;
}
But it seems there is no way to determine a more specific error type than JsonReaderException. Has anyone encountered this issue and found a better workaround than String.Contains()?
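Absent a typed alternative, one way to at least keep the String.Contains workaround contained is to whitelist the recoverable messages in one place (a sketch; the patterns are illustrative and depend on what your inputs actually produce):
private static readonly string[] RecoverableErrors =
{
    "Could not convert string to integer",
    "Could not convert string to double"
};

// requires: using System.Linq;
Error = delegate(object sender, Newtonsoft.Json.Serialization.ErrorEventArgs args)
{
    string message = args.ErrorContext.Error.Message;
    // Swallow only member-level conversion errors; structural reader errors
    // (e.g. "Unterminated string") stay unhandled so deserialization terminates.
    args.ErrorContext.Handled = RecoverableErrors.Any(p => message.Contains(p));
}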
Given:
I have a List<ComplexObjectThatContainsOtherObjectsAndEvenLists> and I want to retain this data across pages/requests. This object is quite large, containing up to 1000 objects.
Current implementation:
What I am currently doing is serializing this complex object using the code below (I found it here on SO, and I am grateful to the author, whom I unfortunately cannot recall; I am sorry):
public static String serialize(Object object) {
    ByteArrayOutputStream byteaOut = new ByteArrayOutputStream();
    GZIPOutputStream gzipOut = null;
    try {
        gzipOut = new GZIPOutputStream(new Base64OutputStream(byteaOut));
        gzipOut.write(new Gson().toJson(object).getBytes("UTF-8"));
    } catch (Exception e) {
        return null;
    } finally {
        if (gzipOut != null) try { gzipOut.close(); } catch (IOException logOrIgnore) {}
    }
    return new String(byteaOut.toByteArray());
}
and hiding the String output in an <input type="hidden"> in my page, passing it back to my controller whenever I need it. This string is around 1300-2000 characters in length.
Question:
Is saving this String in the session better? (see below)
session.setAttribute("mySerializedString", mySerializedString);
Can you please provide pros and cons?
My pros and cons so far (I am not sure though):
I'm not sure, but I think the hidden-field approach can have an effect while the page is rendered (since the string is so long) and again when it is submitted back to the controller. On the other hand, it spares me from manually unsetting the session variable, which I would have to do if I chose the session approach.
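For completeness, the matching deserialize step might look like this; a sketch assuming the same Gson + GZIP + commons-codec Base64 pipeline as the serialize method above:
public static <T> T deserialize(String data, Class<T> clazz) {
    try {
        // Reverse of serialize(): Base64-decode, gunzip, then let Gson
        // rebuild the object graph.
        GZIPInputStream gzipIn = new GZIPInputStream(
                new Base64InputStream(new ByteArrayInputStream(data.getBytes("UTF-8"))));
        return new Gson().fromJson(new InputStreamReader(gzipIn, "UTF-8"), clazz);
    } catch (IOException e) {
        return null;
    }
}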
There is a web service deployed on Tomcat 6 and exposed via Apache CXF 2.3.3. I generated source stubs using wsdl2java to be able to call this service.
Things seemed fine until I sent a big request (~1 MB). That request wasn't processed and failed with this exception:
Interceptor for {http://localhost/}ResourceAllocationServiceSoapService has thrown
exception, unwinding now org.apache.cxf.binding.soap.SoapFault:
Error reading XMLStreamReader.
...
com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
at [row,col {unknown-source}]: [1,0]
Is there some kind of maximum request length here? I'm totally stuck on it.
Vladimir's suggestion worked. The code below should help others understand where to put the 1000000.
public void handleMessage(SoapMessage message) throws Fault {
    // Get message content for dirty editing...
    InputStream inputStream = message.getContent(InputStream.class);
    if (inputStream != null) {
        String processedSoapEnv = "";
        // Cache the InputStream so it can be read independently
        CachedOutputStream cachedInputStream = new CachedOutputStream(1000000);
        try {
            IOUtils.copy(inputStream, cachedInputStream);
            inputStream.close();
            cachedInputStream.close();
            InputStream tmpInputStream = cachedInputStream.getInputStream();
            try {
                String inputBuffer = "";
                int data;
                while ((data = tmpInputStream.read()) != -1) {
                    byte x = (byte) data;
                    inputBuffer += (char) x;
                }
                /**
                 * At this point you can choose to reformat the SOAP
                 * envelope or simply view it. Just make sure you put
                 * an InputStream back when you're done (see below),
                 * otherwise CXF will complain.
                 */
                processedSoapEnv = fixSoapEnvelope(inputBuffer);
            } catch (IOException e) {
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        // Re-set the SOAP InputStream with the new envelope
        message.setContent(InputStream.class, new ByteArrayInputStream(processedSoapEnv.getBytes()));
        /**
         * If you just want to read the InputStream and not
         * modify it, then you just need to put it back where
         * it was using the CXF cached InputStream:
         *
         * message.setContent(InputStream.class, cachedInputStream.getInputStream());
         */
    }
}
I figured out what was wrong. It was actually a bug inside the interceptor's code:
CachedOutputStream requestStream = new CachedOutputStream();
When I replaced this with
CachedOutputStream requestStream = new CachedOutputStream(1000000);
things started working fine.
So the request was simply being truncated while the streams were copied.
I ran into the same issue of getting "com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog" when using the CachedOutputStream class.
Looking at the sources of CachedOutputStream, the threshold is used to switch between keeping the stream's data in memory and storing it in a file.
Assuming the stream holds more data than the threshold allows, it gets stored in a file, and so the following code is going to break:
IOUtils.copy(inputStream,cachedInputStream);
inputStream.close();
cachedInputStream.close(); // closes the stream; the file on disk gets deleted
InputStream tmpInputStream = cachedInputStream.getInputStream(); // the returned tmpInputStream is a brand new *empty* one
// ... reading tmpInputStream here will produce WstxEOFException
Increasing the threshold does help, because all stream data is then kept in memory; in that scenario calling cachedInputStream.close() does not really close the underlying stream implementation, so it can still be read later on.
Here is a 'fixed' version of the above code (at least it worked without exceptions for me):
IOUtils.copy(inputStream,cachedInputStream);
inputStream.close();
InputStream tmpInputStream = cachedInputStream.getInputStream();
cachedInputStream.close();
// reading from tmpInputStream here works fine
The temporary file gets deleted when close() is called on tmpInputStream and there are no other references to it; see the source code of CachedOutputStream.maybeDeleteTempFile().
I am getting a timeout when using JsonpRequestBuilder.
The entry point code goes like this:
// private static final String SERVER_URL = "http://localhost:8094/data/view/";
private static final String SERVER_URL = "http://www.google.com/calendar/feeds/developer-calendar@google.com/public/full?alt=json-in-script&callback=insertAgenda&orderby=starttime&max-results=15&singleevents=true&sortorder=ascending&futureevents=true";
private static final String SERVER_ERROR = "An error occurred while "
        + "attempting to contact the server. Please check your network "
        + "connection and try again.";

/**
 * This is the entry point method.
 */
public void onModuleLoad() {
    JsonpRequestBuilder requestBuilder = new JsonpRequestBuilder();
    // requestBuilder.setTimeout(10000);
    requestBuilder.requestObject(SERVER_URL, new Jazz10RequestCallback());
}

class Jazz10RequestCallback implements AsyncCallback<Article> {
    @Override
    public void onFailure(Throwable caught) {
        Window.alert("Failed to send the message: " + caught.getMessage());
    }

    @Override
    public void onSuccess(Article result) {
        // TODO Auto-generated method stub
        Window.alert(result.toString());
    }
}
The Article class is simply:
import com.google.gwt.core.client.JavaScriptObject;

public class Article extends JavaScriptObject {
    protected Article() {}
}
The GWT page, however, always hits the onFailure() callback and shows this alert:
Failed to send the message. Timeout while calling <url>.
I don't see anything in the Eclipse plugin console, and the URL itself works perfectly when I try it in a browser.
I would appreciate any tips on debugging techniques or other suggestions.
Maybe you should set the callback function explicitly via setCallbackParam, since you have callback=insertAgenda in your URL; I presume that informs the server what the name of the callback function wrapping the JSON should be.
Also, it's worth checking Firebug's console (or a similar tool for your browser); even if GWT doesn't report any exceptions, Firebug still might.
PS: It's useful to use a tool like Firebug to see whether the application does in fact receive the response from the server (which would mean that, for example, you do need the setCallbackParam call) or whether something is wrong on the server side (for whatever reason).
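In code, that suggestion amounts to something like the following sketch (whether "callback" is the right parameter name depends on what your server expects):
JsonpRequestBuilder requestBuilder = new JsonpRequestBuilder();
// The URL above hard-codes callback=insertAgenda; either remove that from
// the URL or align the parameter name here so GWT can inject the function
// name it generates (e.g. __gwt_jsonp__.P0.onSuccess).
requestBuilder.setCallbackParam("callback");
requestBuilder.setTimeout(10000);
requestBuilder.requestObject(SERVER_URL, new Jazz10RequestCallback());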
You have to read the callback request parameter (named callback by default, with a value like __gwt_jsonp__.P0.onSuccess) on the server side and modify the output to
<callback>(<json>);
In this case:
__gwt_jsonp__.P0.onSuccess(<json>);
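On the server this might look like the following servlet sketch (buildJson() is a hypothetical helper producing your JSON payload):
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    // Name of the GWT-generated callback function, e.g. __gwt_jsonp__.P0.onSuccess
    String callback = req.getParameter("callback");
    String json = buildJson(); // hypothetical helper
    resp.setContentType("application/javascript");
    resp.getWriter().write(callback + "(" + json + ");");
}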
Both of these guys are absolutely correct, but here is a concrete example to help you understand exactly what they are referring to.
This is a public JSON API. Take a look at the results:
http://ws.geonames.org/postalCodeLookupJSON?postalcode=M1&country=GB&maxRows=4
This public API supports JSONP through the predefined parameter 'callback'. Basically, whatever value you pass into callback will be used as the function name wrapped around the JSON data you desire. Take a look at the results of these few requests:
http://ws.geonames.org/postalCodeLookupJSON?postalcode=M1&country=GB&maxRows=4&callback=totallyMadeUp
http://ws.geonames.org/postalCodeLookupJSON?postalcode=M1&country=GB&maxRows=4&callback=trollingWithJSONP
It could also be happening for another reason: the web service call returns a plain JSON object, but the callback expects a JSONP object (note there is a difference).
So if you are dealing with the Google Maps API and you are seeing this exception, you need to switch to the API provided by the Maps API itself, something like:
final GeocoderRequest request = GeocoderRequest.create();
request.setAddress(query);
try {
    GWT.log("sending GeoCoderRequest");
    if (m_geocoder == null) {
        m_geocoder = Geocoder.create();
    }
    m_geocoder.geocode(request, new Geocoder.Callback() {
        @Override
        public void handle(final JsArray<GeocoderResult> results,
                final GeocoderStatus status) {
            handleSuccess(results, status);
        }
    });
} catch (final Exception ex) {
    GWT.log("GeoCoder", ex);
}
Alternatively, you could use RequestBuilder from the GWT HTTP library.
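A minimal RequestBuilder sketch (unlike JSONP, this is subject to the same-origin policy; serviceUrl is a placeholder):
RequestBuilder rb = new RequestBuilder(RequestBuilder.GET, URL.encode(serviceUrl));
try {
    rb.sendRequest(null, new RequestCallback() {
        @Override
        public void onResponseReceived(Request request, Response response) {
            // Parse response.getText() here, e.g. with JsonUtils.safeEval()
        }

        @Override
        public void onError(Request request, Throwable exception) {
            // Network failure or similar
        }
    });
} catch (RequestException e) {
    // Thrown if the request could not be initiated
}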