I'm writing some JUnit tests, and have this check comparing the keys and values of two HashMaps:
Iterator<Map.Entry<String, String>> it = expected.entrySet().iterator();
while (it.hasNext()) {
    Map.Entry<String, String> pairs = it.next();
    assertTrue("Checks key exists", actual.containsKey(pairs.getKey()));
    assertThat("Checks value", actual.get(pairs.getKey()), equalTo(pairs.getValue()));
}
It works great, but I have a value that trips it up:
java.lang.AssertionError: Checks value
Expected: "Member???s "
but: was "Member���s "
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
I checked the data, and the data is correct. It appears that the triple ? is tripping something up somehow. Does anyone know why this would happen? It seems pretty basic to me; it's not even Hamcrest getting messed up, it's the actual assert.
You have an encoding conflict. There are many ways this could manifest itself, but it is generally caused by not enforcing a consistent encoding across the board.
Assuming you are using UTF-8 somewhere...
If using Maven, set the property project.build.sourceEncoding to UTF-8. See the docs for more details.
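For example, in the pom.xml (a standard Maven convention):
<properties>
  <!-- Tell the compiler and resource plugins which encoding to use -->
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>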
Other build systems will certainly have options to specify code and resource file encodings.
If using IO to read (or write), always specify the encoding. For example, when reading from a file:
Reader reader = new InputStreamReader(new FileInputStream(file), "UTF-8");
In short, find any build system setting for encoding, and any point where IO is involved when reading text, and ensure your desired encoding is set.
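As a minimal sketch (file is a placeholder for your own input), reading text line by line with an explicit charset looks like this:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Decode explicitly as UTF-8 instead of the platform default charset.
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // process each line...
    }
}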
I am trying to create a POST request that contains multipart/form-data and requires NT credentials. The authentication request causes the POST to be resent, and I get an unrepeatable entity exception.
I tried wrapping the MultipartContent entity that is produced with a BufferedHttpEntity, but it throws a NullPointerException.
final GenericUrl sau = new GenericUrl(baseURI.resolve("Record"));
final MultipartContent c = new MultipartContent()
        .setMediaType(MULTIPART_FORM_DATA)
        .setBoundary("__END_OF_PART__");
final MultipartContent.Part p0 = new MultipartContent.Part(
        new HttpHeaders().set("Content-Disposition",
                format("form-data; name=\"%s\"", "RecordRecordType")),
        ByteArrayContent.fromString(null, "C_APP_BOX"));
final MultipartContent.Part p1 = new MultipartContent.Part(
        new HttpHeaders().set("Content-Disposition",
                format("form-data; name=\"%s\"", "RecordTitle")),
        ByteArrayContent.fromString(null, "JAVA_TEST"));
c.addPart(p0);
c.addPart(p1);
The documentation for ByteArrayContent says
Concrete implementation of AbstractInputStreamContent that generates repeatable input streams based on the contents of byte array.
Making all the parts repeatable does not solve the problem, even though this code:
System.out.println("c.retrySupported() = " + c.retrySupported());
outputs c.retrySupported() = true.
I found the following documentation:
1.1.4.1. Repeatable entities
An entity can be repeatable, meaning its content can be read more than once. This is only possible with self contained entities (like ByteArrayEntity or StringEntity).
I have now converted my MultipartContent to a ByteArrayContent with a multipart/form-data media type by extracting the string contents.
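The conversion was along these lines (a sketch, not my exact code; buffer is just a local name I am assuming):
import com.google.api.client.http.ByteArrayContent;
import java.io.ByteArrayOutputStream;

// Serialize the multipart body once, then wrap the bytes in a
// ByteArrayContent, which is repeatable by construction.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
c.writeTo(buffer);
ByteArrayContent repeatable = new ByteArrayContent(c.getType(), buffer.toByteArray());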
But I still get the following exception when I try to call request.execute().
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
So how do I go about convincing the ApacheHttpTransport to create a repeatable Entity?
I had to modify all the classes that inherit from HttpContent so that they report back correctly from .retrySupported(); that way, when the ApacheHttpTransport code is entered, it creates repeatable content correctly.
The changes were made against version 1.20.0 because that is what I was using. I am submitting a pull request against the dev branch HEAD, so hopefully this, or some version of it, will make it into the next release.
Here are the modifications that need to be merged in.
If content length of all parts in multipart entity is known (returned as a non negative value) the entity will be treated as repeatable. The easiest way to make multipart entity repeatable is to make all its parts repeatable.
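For comparison, with raw Apache HttpClient (not the google-http-client wrapper used in the question), a multipart entity whose parts are all self-contained reports itself as repeatable; a minimal sketch:
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntityBuilder;

// Text bodies have a known length, so the resulting multipart
// entity has a non-negative content length and is repeatable.
HttpEntity entity = MultipartEntityBuilder.create()
        .addTextBody("RecordRecordType", "C_APP_BOX")
        .addTextBody("RecordTitle", "JAVA_TEST")
        .build();
HttpPost post = new HttpPost("http://example.com/Record"); // placeholder URL
post.setEntity(entity);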
The Reference Source page for stringbuilder.cs has this comment in the ToString method:
if (chunk.m_ChunkLength > 0)
{
    // Copy these into local variables so that they
    // are stable even in the presence of ----s (hackers might do this)
    char[] sourceArray = chunk.m_ChunkChars;
    int chunkOffset = chunk.m_ChunkOffset;
    int chunkLength = chunk.m_ChunkLength;
What does this mean? Is ----s something a malicious user might insert into a string to be formatted?
The source code for the published Reference Source is pushed through a filter that removes objectionable content. Verboten words are one example: Microsoft programmers use profanity in their comments. The names of devs are another: Microsoft wants to hide their identity. Such a word or name is substituted with dashes.
In this case you can tell what used to be there from the CoreCLR, the open-sourced version of the .NET Framework. It is a verboten word:
// Copy these into local variables so that they are stable even in the presence of race conditions
That line was hand-edited from the original you looked at before being submitted to GitHub (Microsoft also doesn't want to accuse their customers of being hackers). It originally said races, thus turning into ----s :)
In the CoreCLR repository you have a fuller quote:
Copy these into local variables so that they are stable even in the presence of race conditions
Basically: it's a threading consideration.
In addition to the great answer by @Jeroen, this is more than just a threading consideration; it's to prevent someone from intentionally creating a race condition and causing a buffer overflow in that manner. Later in the code, the length of that local variable is checked. If the code were to check the length of the accessible field instead, it could have changed on a different thread between the time the length was checked and wstrcpy was called:
// Check that we will not overrun our boundaries.
if ((uint)(chunkLength + chunkOffset) <= ret.Length && (uint)chunkLength <= (uint)sourceArray.Length)
{
    // Imagine that another thread has changed the chunk.m_ChunkChars array here!
    // We would now be in big trouble: our attempt to prevent a buffer overflow
    // would have been thwarted. Oh wait, we're OK, because we're using a local
    // variable that the other thread can't access anyway.
    fixed (char* sourcePtr = sourceArray)
        string.wstrcpy(destinationPtr + chunkOffset, sourcePtr, chunkLength);
}
else
{
    throw new ArgumentOutOfRangeException("chunkLength", Environment.GetResourceString("ArgumentOutOfRange_Index"));
}
}
chunk = chunk.m_ChunkPrevious;
} while (chunk != null);
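For illustration, the same copy-to-local defense in Java might look like this (a minimal sketch; the class and field names are made up):
class Buffer {
    // A field that another thread may swap out at any time.
    private volatile char[] chars = new char[0];

    String snapshot() {
        // Copy the reference into a local variable so that the bounds check and
        // the copy both see the same array, even if 'chars' is replaced
        // concurrently by another thread.
        char[] local = this.chars;
        return new String(local, 0, local.length);
    }
}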
Really interesting question though.
I don't think that this is the case: the code in question copies to local variables to prevent bad things from happening if the string builder instance is mutated on another thread.
I think the ---- may relate to a four-letter swear word...
I'm learning Dart and was reading the article Using Dart with JSON Web Services, which told me that I could get help with type checking when converting my objects to and from JSON. I used their code snippet but ended up with compiler warnings. I found another Stack Overflow question which discussed the same problem, and the answer was to use the @proxy annotation and implement noSuchMethod. Here's my attempt:
abstract class Language {
  String language;
  List targets;
  Map website;
}
@proxy
class LanguageImpl extends JsonObject implements Language {
  LanguageImpl();

  factory LanguageImpl.fromJsonString(string) {
    return new JsonObject.fromJsonString(string, new LanguageImpl());
  }

  noSuchMethod(i) => super.noSuchMethod(i);
}
I don't know if the noSuchMethod implementation is correct, and @proxy seems redundant now. Regardless, the code doesn't do what I want. If I run
var lang1 = new LanguageImpl.fromJsonString('{"language":"Dart"}');
print(JSON.encode(lang1));
print(lang1.language);
print(lang1.language + "!");
var lang2 = new LanguageImpl.fromJsonString('{"language":13.37000}');
print(JSON.encode(lang2));
print(lang2.language);
print(lang2.language + "!");
I get the output
{"language":"Dart"}
Dart
Dart!
{"language":13.37}
13.37
type 'String' is not a subtype of type 'num' of 'other'.
and then a stack trace. Hence, although the readability is a little better (one of the goals of the article), the strong typing promised by the article doesn't work, and the code might or might not crash depending on the input.
What am I doing wrong?
The article mentions static types in one paragraph, but JsonObject has nothing to do with static types.
What you get from JsonObject is that you don't need the Map access syntax.
Instead of someMap['language'] = value; you can write someObj.language = value;, and you get the fields in the autocomplete list. But Dart is not able to do any type checking, neither when you assign a value to a field of the object (someObj.language = value;) nor when you use fromJsonString() (as mentioned, because of noSuchMethod/@proxy).
I assume that you want an exception to be thrown on this line:
var lang2 = new LanguageImpl.fromJsonString('{"language":13.37000}');
because 13.37 is not a String. In order for JsonObject to do this it would need to use mirrors to determine the type of the field and manually do a type check. This is possible, but it would add to the dart2js output size.
So barring that, I think that throwing a type error when reading the field is reasonable, and you might have just found a bug-worthy issue here. Since noSuchMethod is being used to implement an abstract method, the runtime can actually do a type check on the arguments and return values. It appears from your example that it's not. Care to file a bug?
If this were addressed, then JsonObject could immediately read a field after setting it to cause a type check when decoding without mirrors, and it could do that check in an assert() so that it's only done in checked mode. I think that would be a nice solution.
So I'm creating a vim script that needs to load and parse a JSON file into a local object graph. I searched and I couldn't find any native way to process a JSON file, and I don't want to add any dependencies to the script. So I wrote my own function to parse the JSON string (gotten from the file), but it's really slow. At the moment, I iterate through each character in the file like so:
let len = strlen(jsonString) - 1
let i = 0
while i < len
  let c = strpart(jsonString, i, 1)
  let i += 1
  " A lot of code to process the file....
  " Note: I've tried short-cutting the process by searching for the closing
  " double quote when I come across an opening double quote (also taking the
  " escaping '\' character into account). It doesn't help.
endwhile
I've also tried this method:
for c in split(jsonString, '\zs')
" Do a lot of parsing ....
endfor
For reference, a file with ~29,000 characters takes about 4 seconds to process, which is unacceptable.
Is there a better way to iterate over a string in vim script?
Or better yet, have I missed a native function to parse JSON?
Update:
I asked for no dependencies because I:
Didn't want to deal with them.
Genuinely wanted some ideas for the best way to do this without relying on someone else's work.
Sometimes just like to do things manually even though the problem has already been solved.
I'm not against plugins or dependencies at all; it's just that I'm curious. Thus the question.
I ended up creating my own function to parse the JSON file. I was creating a script that could parse the package.json file associated with node.js modules. Because of this, I could rely on a fairly consistent format and quit processing whenever I'd retrieved the information I needed. This usually cut out large chunks of the file, since most developers put the largest part of the file, their "readme" section, at the end. Because the package.json file is strictly defined, I left the process somewhat fragile: it assumes a root dictionary { } and actively looks for certain entries. You can find the script here: https://github.com/ahayman/vim-nodejs-complete/blob/master/after/ftplugin/javascript.vim#L33.
Of course, this doesn't answer my own question. It's only the solution to my unique problem. I'll wait a few days for new answers and pick the best one before the bounty ends (already set an alarm on my phone).
The simplest solution with the fewest dependencies is just using the built-in json_decode() Vim function.
let dict = json_decode(jsonString)
Even though Vim's origins date back a long way, its internal string()/eval() representation happens to be so close to JSON that it's likely to work unless you need special characters.
You can lookup the implementation here which even supports true/false/null if you want:
https://github.com/MarcWeber/vim-addon-json-encoding
Better to use that library (vim-addon-manager allows you to install dependencies easily).
Now it depends on your data whether this is good enough.
Now, Benjamin Klein posted your question to vim_use, which is why I'm replying.
The best and fastest replies happen if you subscribe to the Vim mailing list.
Go to vim.sf.net and follow the community link.
You cannot expect the Vim community to scrape Stack Overflow.
I've added the keywords "json" and "parsing" to that little code so that it can be found more easily.
If this solution does not work for you, you can try the many :h if_* bindings, or write an external script which extracts the information you're looking for, or turns JSON into Vim's dictionary representation, which can be read by eval() after correctly escaping the special characters you care about.
If you seek a completely correct solution, omitting dependencies is one of the worst things you can do. The eval() variant mentioned by @MarcWeber is one of the fastest, but it has its disadvantages:
Using the solution for securing eval() I mentioned in a comment means it is no longer the fastest. In fact, it makes eval() slower by more than an order of magnitude (0.02s vs 0.53s in my test).
It does not respect surrogate pairs.
It cannot be used to verify that you have correct JSON: it accepts some strings (e.g. "\<C-o>") that are not JSON strings, and it allows trailing commas.
It fails to give normal error messages. It fails badly if you use the vam#VerifyIsJSON I mentioned in point 1.
It fails to load floating-point values like 1e10 (Vim requires numbers to look like 1.0e10, but numbers like 1e10 are allowed: note the “and/or” in the first paragraph).
All of the above statements (except for the first) also apply to the vim-addon-json-encoding mentioned by @MarcWeber, because it uses eval(). There are some other possibilities:
The fastest and most correct is using Python: pyeval('json.loads(vim.eval("varname"))'). It is not faster than eval(), but it is the fastest among the other possibilities (0.04s in my test: approximately two times slower than eval()).
Note that I use pyeval() here. If you want a solution for a Vim version that lacks this functionality, it will no longer be one of the fastest.
Use my json.vim plugin. It has the advantage of slightly better error reporting compared to the failed vam#VerifyIsJSON (slightly worse compared to eval()), and it correctly loads floating-point numbers. It can be used for verification of strings (it does not accept "\<C-a>"), but it loads lists with a trailing comma just fine. It does not support surrogate pairs. It is also very slow: in the test I used (it uses 279702-character-long strings) it takes 11.59s to load. json.vim tries to use Python if possible, though.
For the best error reporting you can take yaml.vim and purge the YAML support out of it, leaving only JSON (I once did the same thing for pyyaml, though in Python: see the markedjson library used in powerline: it is pyyaml minus the YAML stuff, plus classes with marks). But this variant is even slower than json.vim and should only be used if the main thing you need is error reporting: 207 seconds for loading the same 279702-character-long string.
Note that the only variant mentioned that satisfies both requirements, "no dependencies" and "no Python", is eval(). If you are not fine with its disadvantages, you have to throw away one or both of these requirements, or copy-paste code. Though if you take speed into account, only two candidates are left: eval() and Python. If you want to parse JSON fast you really must use C, and only these solutions spend most of their time in functions written in C.
Most other interpreters (Ruby/Perl/Tcl) do not have a pyeval() equivalent, so they will be slower even if their JSON implementation is written in C. Some others (Lua/Racket (mzscheme)) have a pyeval() equivalent, but e.g. luaeval('{}') is zero, meaning that you will have to add an additional step explicitly and recursively converting objects into Vim dictionaries and lists (e.g. luaeval('vim.dict({})')), which will impact performance. I cannot say anything about mzeval(), but I have never heard of anybody actually using Racket (mzscheme) with Vim.
Imagine an instance of some lookup of configuration settings called "configuration", used like this:
if (!string.IsNullOrEmpty(configuration["MySetting"]))
{
    DoSomethingWithTheValue(configuration["MySetting"]);
}
The meaning of the setting is overloaded. It means both "turn this feature on or off" and "here is a specific value to do something with". These can be decomposed into two settings:
if (configuration["UseMySetting"])
{
    DoSomethingWithTheValue(configuration["MySetting"]);
}
The second approach seems to make configuration more complicated, but slightly easier to parse, and it separates out the two sorts of behaviour. The first seems much simpler at first, but it's not clear what we choose as the default "turn this off" setting; "" might actually be a valid value for MySetting.
Is there a general best practice rule for this?
I find the question to be slightly confusing, because it talks about (1) parsing, and (2) using configuration settings, but the code samples are for only the latter. That confusion means that my answer might be irrelevant to what you intended to ask. Anyway...
I suggest an approach that is illustrated by the following pseudo-code API (comments follow afterwards):
class Configuration
{
    void parse(String fileName);

    boolean exists(String name);

    String lookupString(String name);
    String lookupString(String name, String defaultValue);

    int lookupInt(String name);
    int lookupInt(String name, int defaultValue);

    float lookupFloat(String name);
    float lookupFloat(String name, float defaultValue);

    boolean lookupBoolean(String name);
    boolean lookupBoolean(String name, boolean defaultValue);

    ... // more pairs of lookup<Type>() operations for other types
}
The parse() operation parses a configuration file and stores the parsed data in a convenient format, for example, in a map or hash-table. (If you want, parse() can delegate the parsing to a third-party library, for example, a parser for XML, Java Properties, JSON, .ini files or whatever.)
After parsing is complete, your application can invoke the other operations to retrieve/use the configuration settings.
A lookup<Type>() operation retrieves the value of the specified name and parses it into the specified type (and throws an exception if the parsing fails). There are two overloadings for each lookup<Type>() operation. The version with one parameter throws an exception if the specified variable does not exist. The version with an extra parameter (denoting a default value) returns that default value if the specified variable does not exist.
The exists() operation can be used to test whether a specified name exists in the configuration file.
The above pseudo-code API offers two benefits. First, it provides type-safe access to configuration data (which wasn't a stated requirement in your question, but I think it is important anyway). Second, it enables you to distinguish between "variable is not defined in configuration" and "variable is defined but its value happens to be an empty string".
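As a minimal sketch of one such typed lookup in Java (the Map-backed storage and the exception choices are my assumptions; parse() is elided):
import java.util.HashMap;
import java.util.Map;

class Configuration {
    // Parsed name/value pairs; parse() would populate this from a file.
    private final Map<String, String> values = new HashMap<>();

    boolean exists(String name) {
        return values.containsKey(name);
    }

    int lookupInt(String name) {
        String raw = values.get(name);
        if (raw == null) {
            throw new IllegalArgumentException("missing setting: " + name);
        }
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("setting " + name + " is not an int: " + raw);
        }
    }

    int lookupInt(String name, int defaultValue) {
        // The default-value overloading: an absent name falls back instead of throwing.
        return exists(name) ? lookupInt(name) : defaultValue;
    }
}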
If you have already committed yourself to a particular configuration syntax, then just implement the above Configuration class as a thin wrapper around a parser for the existing configuration syntax. If you haven't already chosen a configuration syntax and if your project is in C++ or Java, then you might want to look at my Config4* library, which provides a ready-to-use implementation of the above pseudo-code class (with a few extra bells and whistles).