IDA Pro Update String References - reverse-engineering

I am currently trying to shift a string into an empty memory region and then update its old reference. It is clear to me how to move the string into a space with enough 00 bytes, but I am unsure how to update its reference (so "aSup" points to the new memory address rather than the old one, which has been replaced with something else).
In the above, I am trying to change "sup" to a longer string. My plan was to move the longer string into the sequence of 00 bytes below it, and then update the aSup reference to point to that new, longer string. However, I am unsure how to update the reference, as I do not see any edit functionality in IDA Pro to accomplish this.
Any help is appreciated, thanks!
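What I want to do can be modeled over a raw byte buffer (Python purely as illustration; the offsets, the little-endian 32-bit pointer width, and doing the real edit via IDA's Edit > Patch program or IDAPython's ida_bytes.patch_bytes are all assumptions, not taken from the binary in question):

```python
import struct

# Toy model: a buffer holding "sup\0", a run of slack 00 bytes, and a
# 4-byte pointer that currently refers to "sup". Offsets are illustrative,
# not real effective addresses.
buf = bytearray(b"sup\x00" + b"\x00" * 28)
buf += struct.pack("<I", 0)                 # the "aSup" pointer, at offset 32, -> 0

# 1) Write the longer string into the free 00-byte region.
new_string = b"a much longer string\x00"
new_off = 4
buf[new_off:new_off + len(new_string)] = new_string

# 2) Overwrite the old pointer so it refers to the new string.
ptr_off = 32
buf[ptr_off:ptr_off + 4] = struct.pack("<I", new_off)

# Following the patched pointer now yields the new string.
resolved, = struct.unpack_from("<I", buf, ptr_off)
print(buf[resolved:buf.index(b"\x00", resolved)].decode())  # a much longer string
```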

Related

How to add back comments/whitespaces in translator using the Antlr4's visitor model

I'm currently writing a TSQL (Sybase/Microsoft SQL) to MySQL translator using the ANTLR4 visitor approach.
I'm able to push comments and whitespaces to different channels so that I can use that information later.
What's not super clear is:
1. How do I get the data back?
2. More importantly, how do I plug the comments and whitespaces back into my translated MySQL code?
Re: #1, this seems to work to get the list of all tokens including the comments/whitespaces:
public static List<Token> getHiddenTokensFromString(String sqlIn, int hiddenChannel) {
    CharStream charStream = CharStreams.fromString(sqlIn);
    CaseChangingCharStream upper = new CaseChangingCharStream(charStream, true);
    TSqlLexer lexer = new TSqlLexer(upper);
    CommonTokenStream commonTokenStream = new CommonTokenStream(lexer, hiddenChannel);
    commonTokenStream.fill();
    List<Token> hiddenTokens = commonTokenStream.getTokens();
    return hiddenTokens;
}
Re #2, what makes it particularly challenging is that, as part of the translation, lines of SQL have to be moved around: some lines removed and some lines added.
Any help will be greatly appreciated.
Thanks.
The ANTLR4 lexer creates a number of tokens, each with an index (a running number). Provided you didn't just skip a token, all tokens are available for later inspection, once the parsing step is done, regardless of their channels (the channel is actually just a number property on a token).
So, given a token you want to translate, get its index and then ask the token stream for the tokens with the next smaller or next higher index. These are usually the hidden whitespace tokens.
Once you have a whitespace token, use its start and stop index to get the original text from the char stream. And since you know where you are in the translation process when you do that, it should be easy to know where to insert the original text.
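That neighbor lookup can be sketched with a stand-in token list (the Token class below is a toy model, not the ANTLR API; in the Java runtime, BufferedTokenStream's getHiddenTokensToLeft/getHiddenTokensToRight perform this same search for you):

```python
from dataclasses import dataclass

# Toy model of ANTLR's token/channel scheme. Each token carries a running
# index and a channel number; channel 0 is the default channel, channel 1
# stands in for the hidden whitespace/comment channel.
@dataclass
class Token:
    index: int
    channel: int
    text: str

def hidden_neighbors(tokens, index, hidden_channel=1):
    """Return the hidden tokens immediately before and after tokens[index]."""
    left = None
    if index > 0 and tokens[index - 1].channel == hidden_channel:
        left = tokens[index - 1]
    right = None
    if index + 1 < len(tokens) and tokens[index + 1].channel == hidden_channel:
        right = tokens[index + 1]
    return left, right

# "SELECT  x" tokenized: SELECT | two hidden spaces | x
tokens = [
    Token(0, 0, "SELECT"),
    Token(1, 1, "  "),
    Token(2, 0, "x"),
]
left, right = hidden_neighbors(tokens, 2)   # neighbors of "x"
```

Applied to the translation problem: when emitting a translated token, look up its hidden left neighbor and copy that neighbor's original text in front of the output.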

Why does tcl consider an empty string a double?

Quite confused about this one:
$ tclsh
% string is double {}
1
Why would tcl consider an empty string to be a valid double?
Summary: It's a bit of a misdesign, a case where the initial use-case was the wrong one to design around, but we can't change it in 8.* (I'm not sure about Tcl 9.0; we still want to avoid gratuitous changes there).
The string is command was originally designed to support the Tk entry widget's validation options. These let the widget respond to typing (or focus changes) by checking whether the change left the widget in a valid state, such as holding an integer. If you wanted that, you'd just do this:
entry $w -validate key -vcmd {string is integer %P} -invcmd {bell}
Then, if you pressed a letter key, say A, with the cursor in the middle of an integer, the edit would be rejected and the system would make a warning noise. Really easy.
There's only one slight problem. If you had selected all the text in the entry and then pressed a digit, the edit would also be rejected (if string is were strict by default). The problem is that there's an intermediate state in the edit where the old text has been deleted but the new text has not yet been inserted: the validation occurs twice in such a situation, once for the delete and once for the insert. (It has to be that way because of how things are tied together under the hood.) That's a terrible user experience, so string is was made lax by default so that this use case would work; strict checking remains available on request via the -strict option.
It's not a decision I agreed with: it should have been the other way round, with you needing to request laxity in the test if you want it, which would have added very little overhead here while allowing other uses to be saner. But I was just an ordinary user at that point. I prefer to use multi-stage validation in my forms: keypress-level validation acts as a soft check that allows bad input while the user is part-way through filling in the form, and merely indicates that problems exist, via techniques such as adjusting background colours and disabling submit buttons. (But that's off-topic for your question…)
Library command design is tricky. It takes careful consideration of use-cases to get right. Sometimes we fail.
The problem originated in code that was external to Tcl and Tk, around the time of Tcl 8.1.0. Most of the patch that introduced this was very good (it also gave us commands such as string equal and string map), but this was an aspect that could have done with a little more cooking.

How to find a utf8 code from html document?

I'm currently scraping info from a website which uses icon fonts to identify information. When I find the element that contains the icon, I get the "󲁋" character as expected. I want to identify the UTF-8 code of the character and thereby identify which symbol was used.
I'm looking to do something along these lines:
For Each HTMLElement In HTMLDocument.getElementsByClassName("icon-class")
    utf8code = HTMLElement.innerText
    If utf8code = U+00AE Then
        'do things
    End If
Next
OK, whilst I wasn't able to fully achieve the goal of identifying the UTF-8 code of any character, I did manage to find a way to identify the characters for my use case.
As it turned out, in my case there are around 30 characters and they appear more or less sequentially in Unicode. The task was then to understand how the UTF-8 encoding is formed, and user Remy Lebeau helped point me in the right direction. This video was very helpful for that: https://youtu.be/MijmeoH9LT4
My own summation is as follows:
1st byte: remove the first n+1 bits, where n = the total number of bytes.
2nd to nth byte: remove the first two bits.
Combine the remaining bits, starting from the rightmost bit and moving left; pad on the left with 0s to reach a multiple of 8.
So, as in my example with 4 bytes:

243, 178, 129, 139
11110011, 10110010, 10000001, 10001011
strip the prefixes:  011, 110010, 000001, 001011
concatenate:         011110010000001001011
pad to 24 bits:      000011110010000001001011
regroup into bytes:  00001111, 00100000, 01001011
in hex:              0F, 20, 4B
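The bit-twiddling above can be cross-checked in a couple of lines: Python's UTF-8 decoder does the same unpacking, and ord() then yields the code point directly (the byte values are taken from the worked example above):

```python
# The four UTF-8 bytes from the worked example.
raw = bytes([243, 178, 129, 139])

# Decoding and taking ord() yields the Unicode code point directly,
# matching the hand-computed 0F 20 4B (i.e. U+F204B).
code_point = ord(raw.decode("utf-8"))
print(hex(code_point))  # 0xf204b
```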
Now the code I used to help identify which character I was finding:
Dim utf8Encoding As New System.Text.UTF8Encoding(True)
Dim encodedString() As Byte
encodedString = utf8Encoding.GetBytes(HTML_Element.innerText)
Select Case encodedString(3) ' the 4th byte (0-based index) distinguishes my icons
    Case 147
        'handle this icon
    Case 155
        'handle this icon
End Select
In my particular case I was able to use a hashtable to relate the value of the 4th byte to a separate value that I needed.
Is this a good solution? no, it only works in specific cases and being able to simply obtain the UTF8 code would create a solution that is more effective and elegant for all use cases. But as this is a project for personal use only, and through a combination of lack of personal understanding and lack of people willing to help me understand, this solution works for me and so I figured I would include it in case anybody finds themselves in a similar situation where the above shortcut might help.

MUMPS can't format Number to String

I am trying to convert a large number to a string in MUMPS, but I can't.
Let me explain what I would like to do:
s A="TEST_STRING#12168013110012340000000001"
s B=$P(A,"#",2)
s TAB(B)=1
s TAB(B)=1
I would like to create an array TAB where variable B will be the primary key.
When I do ZWR, I get:
A="TEST_STRING#12168013110012340000000001"
B="12168013110012340000000001"
TAB(12168013110012340000000000)=1
TAB("12168013110012340000000001")=1
As you can see, the first SET treats variable B as a number (wrongly converted) and the second SET treats it as a string (as I would like to see).
My question is how to write the SET command so that variable B is treated as a string instead of a number (which is wrong, in my opinion).
Any advice/explanation will be helpful.
This may be a limitation of the sorting/storage mechanism built into MUMPS, and it differs between MUMPS implementations. The cause is that while variable values in MUMPS are untyped, index values are not -- numeric indices are sorted before string ones. When converting a long digit string to a number, rounding errors may occur. To prevent this from happening, add a space before the number in your index to explicitly treat it as a string:
s TAB(" "_B)=1
As far as I know, InterSystems Caché doesn't have this limitation -- at least, your code works fine in Caché, and the documentation claims support for up to 309 digits:
http://docs.intersystems.com/cache20141/csp/docbook/DocBook.UI.Page.cls?KEY=GGBL_structure#GGBL_C12648
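The rounding itself is ordinary floating-point truncation, and the same effect can be reproduced in any language with 64-bit floats (Python here purely as an illustration of the precision loss, not of MUMPS internals):

```python
# A 26-digit subscript is far beyond the ~15-16 significant decimal digits
# a 64-bit float can hold, so any numeric interpretation must round.
b = "12168013110012340000000001"

as_float = float(b)             # numeric interpretation, precision lost
round_tripped = int(as_float)

print(round_tripped == int(b))  # False: the trailing digits were rounded away
```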
I've tried to recreate your scenario, but I am not seeing the issue you're experiencing.
It actually is not possible (in my opinion) for the same command, executed immediately one after another, to produce two different results.
s TAB(B)=1
s TAB(B)=1
For as long as the value of B did not change between the executions, the result should be:
TAB("12168013110012340000000001")=1
Here is an example of what the GT.M implementation of MUMPS returns in your case:

PDF Open Parameters: comment=commentID doesn't work

According to Adobe's manual on PDF Open Parameters, PDF files can be opened with certain parameters from the command line or from a link in HTML.
These open parameters include page=pagenum, zoom=scale, comment=commentID, and others. (The first parameter should be preceded by a # and each subsequent one by a &.)
The official PDF Open Parameters manual from Adobe gives this example:
#page=1&comment=452fde0e-fd22-457c-84aa-2cf5bed5a349
but the comment part doesn't work for me!
page=pagenum and zoom=scale work well for me, but comment=commentID does not. I tried Adobe Reader 6.0.0 and Adobe Pro Extended 9.0.0: I can't get to the specified comment.
Also, I got the comment ID by exporting the comments in XFDF format; in the resulting file there is a name attribute on every comment, which I hope corresponds to the ID (at least its appearance matches the example in the manual).
I thought maybe there is a setting that I should first enable (or maybe disable) in Adobe, or maybe I am getting the comment IDs wrong, or maybe something else?!
Any help would be extremely appreciated
According to the docs, you must include a page=X along with your comment=foo. The sample you quoted has one, but it comes straight from the docs rather than from a link you built yourself.
Are you missing a page= when setting comment=?
BASTARDS!
From the last page of the manual you linked:
URL Limitations
●Only one digit following a decimal point is retained for float values.
●Individual parameters, together with their values (separated by & or #), can be no greater than 32 characters in length.
Emphasis added.
The comment ID is a 16-byte value expressed as hex, with four hyphens thrown in to break up the monotony. That's 36 characters right there... and the "comment=" prefix adds another 8 characters, for 44 characters total.
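The arithmetic can be double-checked mechanically (the ID below is the one from Adobe's example quoted earlier):

```python
# The comment ID from Adobe's own example, as it would appear in the URL.
param = "comment=452fde0e-fd22-457c-84aa-2cf5bed5a349"

# 32 hex digits + 4 hyphens = 36 characters for the ID alone, plus
# len("comment=") == 8, for 44 in total -- well over the documented
# 32-character limit per parameter.
print(len(param))  # 44
```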
According to that, a comment ID can NEVER WORK, including the samples shown in their own docs.
Are you trying it just on the command line, or have you tried via a web browser too? I wonder if that makes a difference. If not, we're looking at a feature that CANNOT WORK. EVER... and probably never has.