Program for diff-ing binary files? - binary

(The story is relevant...mostly)
So I'm over at my buddy's house playing some RE5 Gold Edition, beat the game, unlock a bunch of stuff, and then I copy the save file to my memory stick so I can take it home with me.
Of course, the save is locked to his profile so I can't play it on my PS3, even though I was there beating everything with him. Lame.
So, I've got this save file sitting on my memory stick, I want to see if I can hack it to work with my profile.
I figure if I just create 2 new saves on different profiles and do nothing else, they should be identical except for the profile information. Then I just have to replace my friend's profile info with mine, and it should work, right?
So I need a tool for diff-ing these 2 binary files so I can quickly locate the parts of the file that are different. I know there are plenty of tools for text, but what about for binary?
(Actually, there are 3 files, DATA0.DAT, PARAM.PFD, and PARAM.SFO... not sure if anyone knows anything about PS3 save files, specifically for Resident Evil 5 Gold)
Don't think it's going to be possible. Apparently the save file is "protected". All it would take to prevent me from tampering with it is if they hash the contents of the data using some unknown algorithm, and then verify the hash matches up upon load. Not sure if they're doing that, but... guess it would be kind of dumb if they weren't doing something similar.

Hex Workshop is one of the premier hex manipulation applications and it has a file compare function.
But be aware that the game may not use a straight foward data saving mechanism, you may be dealing with a custom database structure, or the data may be encrypted. Game developers typically don't make it easy to hack save files, for obvious reasons...

I thought most of compare tools can do that (like Beyond Compare which I love). For example, there is FC.exe in Windows 7 in System folder. Compares ASCII and binaries. See http://support.microsoft.com/kb/159214 for some details.

check out hex workshop. most other hex editors out there should have this feature as well.

Related

How to make a good anti-crack protection?

I will start off with saying I know that it is impossible to prevent your software from reverse engineering.
But, when I take a look at crackmes.de, there are crackmes with a difficulty grade of 8 and 9 (on a scale of 1 to 10). These crackmes are getting cracked by genius brains, who write a tutorial on how to crack it. Some times, such tutorials are 13+ pages long!
When I try to make a crackme, they crack it in 10 minutes. Followed by a "how-to-crack" tutorial with a length of 20 lines.
So the questions are:
How can I make a relatively good anti-crack protection.
Which techniques should I use?
How can I learn it?
...
Disclaimer: I work for a software-protection tools vendor (Wibu-Systems).
Stopping cracking is all we do and all we have done since 1989. So we thoroughly understand how SW gets cracked and how to avoid it. Bottom line: only with a secure hardware dongle, implemented correctly, can you guarantee against cracking.
Most strong anti-cracking relies on encryption (symmetric or public key). The encryption can be very strong, but unless the key storage/generation is equally strong it can be attacked. Lots of other methods are possible too, even with good encryption, unless you know what you are doing. A software-only solution will have to store the key in an accessible place, easily found or vulnerable to a man-in-the-middle attack. Same thing is true with keys stored on a web server. Even with good encryption and secure key storage, unless you can detect debuggers the cracker can just take a snapshot of memory and build an exe from that. So you need to never completely decrypt in memory at any one time and have some code for debugger detection. Obfuscation, dead code, etc, won't slow them down for long because they don't crack by starting at the beginning and working through your code. They are far more clever than that. Just look at some of the how-to cracking videos on the net to see how to find the security detection code and crack from there.
Brief shameless promotion: Our hardware system has NEVER been cracked. We have one major client who uses it solely for anti-reverse engineering. So we know it can be done.
Languages like Java and C# are too high-level and do not provide any effective structures against cracking. You could make it hard for script kiddies through obfuscation, but if your product is worth it it will be broken anyway.
I would turn this round slightly and think about:
(1) putting in place simple(ish) measures so that your program isn't trivial to hack, so e.g. in Java:
obfuscate your code so at least make your enemy have to go to the moderate hassle of looking through a decompilation of obfuscated code
maybe write a custom class loader to load some classes encrypted in a custom format
look at what information your classes HAVE to expose (e.g. subclass/interface information can't be obfuscated away) and think about ways round that
put some small key functionality in a DLL/format less easy to disassemble
However, the more effort you go to, the more serious hackers will see it as a "challenge". You really just want to make sure that, say, an average 1st year computer science degree student can't hack your program in a few hours.
(2) putting more subtle copyright/authorship markers (e.g. metadata in images, maybe subtly embed a popup that will appear in 1 year's time to all copies that don't connect and authenticate with your server...) that hackers might not bother to look for/disable because their hacked program "works" as it is.
(3) just give your program away in countries where you don't realistically have a chance of making a profit from it and don't worry about it too much-- if anything, it's a form of viral marketing. Remember that in many countries, what we see in the UK/US as "piracy" of our Precious Things is openly tolerated by government/law enforcement; don't base your business model around copyright enforcement that doesn't exist.
I have a pretty popular app (which i won't specify here, to avoid crackers' curiosity, of course) and suffered with cracked versions some times in the past, fact that really caused me many headaches.
After months struggling with lots of anti-cracking techniques, since 2009 i could establish a method that proved to be effective, at least in my case : my app has not been cracked since then.
My method consists in using a combination of three implementations :
1 - Lots of checks in the source code (size, CRC, date and so on : use your creativity. For instance, if my app detects tools like OllyDbg being executed, it will force the machine to shutdown)
2 - CodeVirtualizer virutalization in sensitive functions in source code
3 - EXE encryption
None of these are really effective alone : checks can be passed by a debugger, virtualization can be reversed and EXE encryption can be decrypted.
But when you used altogether, they will cause BIG pain to any cracker.
It's not perfect although : so many checks makes the app slower and the EXE encrypt can lead to false positive in some anti-virus software.
Even so there is nothing like not be cracked ;)
Good luck.
Personaly I am fan of server side check.
It can be as simple as authentication of application or user each time it runs. However that can be easly cracked. Or puting some part of code to server side and that would requere a lot more work.
However your program will requere internet connection as must have and you will have expenses for server. But that the only way to make it relatively good protected. Any stand alone application will be cracked relatively fast.
More logic you will move to server side more hard to crack it will get. But it will if it will be worth it. Even large companies like Blizzrd can't prevent theyr server side being reversed engineered.
I purpose the following:
Create in home a key named KEY1 with N bytes randomly.
Sell the user a "License number" with the Software. Take note of his/her name and surname and tell him/her that those data are required to activate the Software, also an Internet conection.
Upload within the next 24 hours to your server the "License number", and the name and surname, also the KEY3 = (KEY1 XOR hash_N_bytes(License_number, name and surname) )
The installer asks for a "Licese_number" and the name and surname, then it sends those data to the server and downloads the key named "KEY3" if those data correspond to a valid sell.
Then the installer makes KEY1 = KEY3 XOR hash_N_bytes(License_number, name and surname)
The installer checks KEY1 using a "Hash" of 16 bits. The application is encrypted with the KEY1 key. Then it decrypts the application with the key and it's ready.
Both the installer and application must have a CRC content check.
Both could check is being debugged.
Both could have encrypted parts of code during execution time.
What do you think about this method?

Reversing an old file format Inbox X

I’m trying to reverse engineer an old medical imaging format called Stentor for interoperability. It was designed by a company of the same name who was subsequently bought by Phillips. But Phillips has forgotten how to read Stentor files. I have a windows program which exports JPEG from Stentor files but it’s closed source. I’d like to automate this process in order to tackle hundreds of files in this format.
The program is late-1990s Win32 or MFC executeable. It runs next to an ActiveX (.ocx) file which I’ve been able to interop with, but that file doesn’t contain the export method. I'm looking for suggestions on how to dissemble the binary in order to unearth the algorithm used to convert Stentor to JPEG. I looked through the Stentor files in hex editor and didn’t find any evidence of JPEG (although hints on finding that would be appreciated too), so I think that the program has a couple of tricks up its sleeve.
Thanks in advance.
Kyle
Few programmers implement complex routines such as image recoding themselves. Instead they tend to license libraries that do that. A very smart way to start would be searching for text strings and see if you can discover the libraries they use. This will subsequently give you a lot of insight into how the data is encoded.
Another good strategy would be to build a program that simply runs the GUI of your export program by sending mouse and keyboard events directly to it. Let this run a few days to complete your export. Reverse engineering the file format is going to be slow and expensive so for a 1 time gig it's probably not worthwhile.

How do i convert hdb file? ... believed to be from act! source

Any ideas ?
I think the original source was a goldmine database, looking around it appears that the file was likely built using an application called ACT which I gather is a huge product I don't really want to be deploying for a one off file total size less than 5 meg.
So ...
Anyone know of a simple tool that I can run this file through to convert it to a standard CSV or something?
It does appear to be (when looking at it in notepad and excel) in some sort of csv type format but it's like the data is encrypted somehow.
Ok this is weird,
I got a little confused because the data looked a complete mess, in actual fact the mess was the data, that's what it was meant to look like.
Simply put, i opened the file in notepad, seemed to have a sort of pattern so i droppped it on excel.
Apparently excel has no issues reading these files ... strange huh !!!
I am unaware of any third party tooling for opening these files specifically, although there is an SDK available for C# which could resolve your problem with a little elbow grease.
The SDK can be aquired for free Here
Also there is a developer forum which could provide some valuable resources including training material with sample code Here
Resources will be provided with the SDK
Also, out of interest since ACT is a Sage product have you any Sage software floating about which you could attempt to access the data with? Most offices have!
Failing all of the above there is a trial available for ACT! Here!
Good luck with your problem!

What should I put in header comments at the top of source files?

I've got lots of source code files written in various languages, but none of them have a standard comment at the top (sometimes even across the same project). Some of them don't have any header comment at all :-)
I've been thinking about creating a standard template that I can use at the top of my source files, and was wondering what fields I should include.
I know I want to include my name and a short description of what the file contains/does. Should I also include the date created? The date last modified? The programmer who last modified the file? What other fields have you found to be useful?
Any tips and comments welcome.
Thanks,
Cameron
This seems to be a dying practice.
Some people here on StackOverflow are against code comments altogether (reasoning that code should be written to be self explanatory) While I wouldn't go that far, some of the points of the anti-comment crowd make sense, such as the fact that comments tend to be out of date.
Header blocks of comments suffer from these symptoms even more so. Every organization I've been with that has had these header blocks, they are out of date. They have a author name of some guy who doesnt even work there any more, a description that does not match the code at all (assuming it ever did) and a last modified date, that once compared with version control history, seems to have missed its last dozen updates.
In my personal opinion, keep comments close to the code. If you want to know purpose of, and/or history of, a code file, use your version control system.
Date created, date modified and author who last changed the file should be stored in your source control software.
I usually put:
The main purpose of the file and things within the file.
The project/module the file belongs to.
The license associated with the file (and a LICENSE file in the project root).
Who is responsible for the file (either the team, person, or both)
Back in 2002, when I was straight out of college and jobs were few and far between after the dot-com bust, I joined a service company which used to create software customized for their clients in Java. I had to sit in the office of a client (which was a ramshackle room in an electric sub-station rigged with an AC to keep the servers running), sharing chairs/PCs with other guys in the team. The other engineers (if I can call them engineers ;) in the group used to make changes ad-hoc to the source code, compile the files and put them into production.
No way to figure out who made what change.
No way to figure out why any change was made.
No way to go to previous version of code, unless the engineer "remembered" what he modified.
Backup: Copy over files from the production server, which were replaced with new files.
Location of backup: Home directory of engineer copying over files to production server.
Reports of production servers going down due to botched attempts of copying over files to the server (missed a file to be copied over, backups getting lost or wrong files being copied over or not all files being copied over) were met with shrugs (oh no, is it down? let's see what happened; hey who changed what recently...? ummm...).
During those days, after spending several frustrating days trying to figure out the whos and whys behind the code, I had devised a system for comments in a list in the header of the source file which detailed the following:
Date of change made
Who made the change
Why was the change made
Two months later when the list threatened to challenge the size of the source code in the file, the manager had the bright idea of getting a source version control system.
I have never needed to put any comments in headers of source files (except for copyright notices) in any company I worked since. In my current company, everything else is mostly self-evident by looking at the code, or going to the bug reporting system which is integrated with the source version control system.
What fields do you need? If you have to ask whether to put some info there, you don't really need that info. Unless you are forced, by some bureaucratic incompetence of your employer, I don't see why you should go looking for more info than you already feel should be there.\
In most organizations, all source files have to begin with a legal blurb. If you're really lucky, it's just a one-liner, but in most cases it's a really long block of legalese. As a result, few people ever read these. Our eye just travels to the first program element and then goes up to its documentation.
So if you want to write anything, write it in association with the topmost program element, not the file.
Any other bookkeeping information should generally be part of your version control, not maintained (poorly) in the file itself.
In addition to the comment above stating license, the project that it belongs to, etc I also tend to put the "weird" requirements at the top as well (such as "built with version X of library Y") so you, or the person who picks it up after you won't change something that the program relies on without realizing it (or, if they do, they will at least know what to change back)
A lot depends on whether you're using an auto-documentation generation tool or not.
While I agree with many of the comments, if you're using JavaDoc or some other documentation generating tool that depends on comments, you'll obviously need to include the things it wants to see.
You did not mention that you are using a version control system and your comment in Neil N's answer confirms this for your older code. While using version control is the best way to go I also have experienced many situations where the cost of doing so for older code would not be paid for by the project's sponsor. If you do not have a centralized change history for the project then the change history can be put in the modules. It is good that you are using a version control system for your new code.
Your company name
All rights reserved (c) year - or reference to appropriate license
Project or library this file is for
Module it belongs to
Description of what it contains
History
-------
01/08/2010 - Programmer - version
Initial creation.
01/09/2010 - Programmer - version
Change description.
01/10/2010 - Programmer - version
Change description.
Those useful fields that you mentioned are good ones. Who modified the file and when.
Your version control software should allow for the embedding of keywords within comments. For example, in CVS, the $Id$ will resolve to the file, date/time modified, and user that modified the file. It will automatically be kept up to date with each check-in.
Include the following information:
What this file is for. That's a very useful piece of knowledge and it's more important than anything else. You should tell the reader, why there is such a file, why did you group functions in a separate file/package/module and why they are used. Maybe briefly, one or two lines, but that should be there.
Legal stuff, if appplicant.
Leave the place for special commands of console editors, such as of Emacs.
Add special commands that your auto-documenting system requires.
Things things you shouldn not include are
Who created the file
When it was created
Who modified it the last time
When it was last modified
What was added by the latest modification
You can--and should--retrieve it via the version control system, where it's constantly and automatically kept up-to-date. Let alone that most of these points are just useless.
Who created the file
When it was created
Who modified it the last time
When it was last modified
What was added by the latest modification

how are serial generators / cracks developed?

I mean, I always was wondered about how the hell somebody can develop algorithms to break/cheat the constraints of legal use in many shareware programs out there.
Just for curiosity.
Apart from being illegal, it's a very complex task.
Speaking just at a teoretical level the common way is to disassemble the program to crack and try to find where the key or the serialcode is checked.
Easier said than done since any serious protection scheme will check values in multiple places and also will derive critical information from the serial key for later use so that when you think you guessed it, the program will crash.
To create a crack you have to identify all the points where a check is done and modify the assembly code appropriately (often inverting a conditional jump or storing costants into memory locations).
To create a keygen you have to understand the algorithm and write a program to re-do the exact same calculation (I remember an old version of MS Office whose serial had a very simple rule, the sum of the digit should have been a multiple of 7, so writing the keygen was rather trivial).
Both activities requires you to follow the execution of the application into a debugger and try to figure out what's happening. And you need to know the low level API of your Operating System.
Some heavily protected application have the code encrypted so that the file can't be disassembled. It is decrypted when loaded into memory but then they refuse to start if they detect that an in-memory debugger has started,
In essence it's something that requires a very deep knowledge, ingenuity and a lot of time! Oh, did I mention that is illegal in most countries?
If you want to know more, Google for the +ORC Cracking Tutorials they are very old and probably useless nowdays but will give you a good idea of what it means.
Anyway, a very good reason to know all this is if you want to write your own protection scheme.
The bad guys search for the key-check code using a disassembler. This is relative easy if you know how to do this.
Afterwards you translate the key-checking code to C or another language (this step is optional). Reversing the process of key-checking gives you a key-generator.
If you know assembler it takes roughly a weekend to learn how to do this. I've done it just some years ago (never released anything though. It was just research for my game-development job. To write a hard to crack key you have to understand how people approach cracking).
Nils's post deals with key generators. For cracks, usually you find a branch point and invert (or remove the condition) the logic. For example, you'll test to see if the software is registered, and the test may return zero if so, and then jump accordingly. You can change the "jump if equals zero (je)" to "jump if not-equals zero (jne)" by modifying a single byte. Or you can write no-operations over various portions of the code that do things that you don't want to do.
Compiled programs can be disassembled and with enough time, determined people can develop binary patches. A crack is simply a binary patch to get the program to behave differently.
First, most copy-protection schemes aren't terribly well advanced, which is why you don't see a lot of people rolling their own these days.
There are a few methods used to do this. You can step through the code in a debugger, which does generally require a decent knowledge of assembly. Using that you can get an idea of where in the program copy protection/keygen methods are called. With that, you can use a disassembler like IDA Pro to analyze the code more closely and try to understand what is going on, and how you can bypass it. I've cracked time-limited Betas before by inserting NOOP instructions over the date-check.
It really just comes down to a good understanding of software and a basic understanding of assembly. Hak5 did a two-part series on the first two episodes this season on kind of the basics of reverse engineering and cracking. It's really basic, but it's probably exactly what you're looking for.
A would-be cracker disassembles the program and looks for the "copy protection" bits, specifically for the algorithm that determines if a serial number is valid. From that code, you can often see what pattern of bits is required to unlock the functionality, and then write a generator to create numbers with those patterns.
Another alternative is to look for functions that return "true" if the serial number is valid and "false" if it's not, then develop a binary patch so that the function always returns "true".
Everything else is largely a variant on those two ideas. Copy protection is always breakable by definition - at some point you have to end up with executable code or the processor couldn't run it.
The serial number you can just extract the algorithm and start throwing "Guesses" at it and look for a positive response. Computers are powerful, usually only takes a little while before it starts spitting out hits.
As for hacking, I used to be able to step through programs at a high level and look for a point where it stopped working. Then you go back to the last "Call" that succeeded and step into it, then repeat. Back then, the copy protection was usually writing to the disk and seeing if a subsequent read succeeded (If so, the copy protection failed because they used to burn part of the floppy with a laser so it couldn't be written to).
Then it was just a matter of finding the right call and hardcoding the correct return value from that call.
I'm sure it's still similar, but they go through a lot of effort to hide the location of the call. Last one I tried I gave up because it kept loading code over the code I was single-stepping through, and I'm sure it's gotten lots more complicated since then.
I wonder why they don't just distribute personalized binaries, where the name of the owner is stored somewhere (encrypted and obfuscated) in the binary or better distributed over the whole binary.. AFAIK Apple is doing this with the Music files from the iTunes store, however there it's far too easy, to remove the name from the files.
I assume each crack is different, but I would guess in most cases somebody spends
a lot of time in the debugger tracing the application in question.
The serial generator takes that one step further by analyzing the algorithm that
checks the serial number for validity and reverse engineers it.