Anyone aware of what format the old IBM PTS (Personal Typing System) used - legacy

I have been trying to write a program to convert my father's old PTS files. I assumed they were standard EBCDIC, but when I convert to ASCII I get mostly junk. Anyone have any knowledge about PTS?

If it's the PS/2 version, it's DOS 3.x & WordPerfect. They were around at a certain large energy company when I worked there, but with the advent of 80286-class workstations & decent laser printers, they went away.
IIRC, EBCDIC never made it off the mainframes & AS400s. I seem to recall some experimental PCs (fuzzy memory), but they died quickly.
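If the files do turn out to be some EBCDIC variant after all, one quick way to narrow down the code page is to score a few candidate decodings and see which produces readable text. This is just a rough sketch in Python; the scoring heuristic and the list of candidate code pages are my own assumptions, not anything specific to PTS:

```python
def printable_ratio(text):
    """Fraction of characters that look like ordinary text."""
    ok = sum(1 for ch in text if ch.isalnum() or ch in " .,;:'\"-\r\n\t")
    return ok / max(len(text), 1)

def guess_encoding(raw, candidates=("cp037", "cp500", "cp1140", "latin-1")):
    """Score each candidate code page by how readable its decoding is."""
    scored = [(codec, printable_ratio(raw.decode(codec, errors="replace")))
              for codec in candidates]
    return sorted(scored, key=lambda t: -t[1])

# Demo: the cp037 (US EBCDIC) bytes for "HELLO WORLD" score highest
# with an EBCDIC codec, not with an ASCII-compatible one.
sample = "HELLO WORLD".encode("cp037")
print(guess_encoding(sample)[0][0])   # → cp037
```

If none of the stock EBCDIC code pages score well, the file may use a proprietary format on top of (or instead of) EBCDIC, which would explain the junk.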

Related

develop a game for an old system

I have a school programming project; the subject is free as long as we demonstrate our programming skills. I have an Amiga 500 lying around and I wondered if I could make a game for it? Maybe nothing complicated; I know how limited the system is, but is it possible to make and test it on a Windows 10 PC with an emulator and later burn it onto a magnetic disk? Also, if I put a Commodore diskette in a USB disk reader, is it possible to read the code? Or is it "too compiled" to learn anything from it? Thank you!
The Amiga is (was) a nice computer to program on.
Depending on your time budget, learning and coding a game from scratch can be quite time consuming (unless you already are familiar with the 68000 assembly language).
There are plenty of alternatives, though, for programming on this computer, using various languages:
C Programming
Using C with either GCC or VBCC and a dedicated (but quite simple and compact) game engine, such as ACE : https://github.com/AmigaPorts/ACE
The Amiga comes with a built-in library of graphics & audio functions (even the good old A500) that can help you develop a game in a fully OS-friendly way. The result will be rather slow, but you will be able to use every part of the hardware (blitter, copper, bobs, sprites...) : http://amigadev.elowar.com/read/ADCD_2.1/Includes_and_Autodocs_2._guide/node040D.html
Basic
The Amiga has several versions of the Basic language, most of them implementing a rather comprehensive set of commands to get the most out of its hardware (sprites, bobs...)
You may want to try AMOS Basic, the most famous one : https://www.ultimateamiga.co.uk/index.php/page,16.html
As an alternative, Blitz Basic was probably the second most popular basic on this machine : https://www.amigafuture.de/downloads.php?view=detail&df_id=3663&sid=661dbda78c2a180a20715f7467a95708
68K Assembly
If you really need to get the most of the Amiga while being super close to the metal, and are starting from virtually nothing, then you might want to look at Photon's video tutorials : https://www.youtube.com/watch?v=p83QUZ1-P10&list=PLc3ltHgmiidpK-s0eP5hTKJnjdTHz0_bW
These tutorials are rather demoscene-oriented, but the visual side of a game dev project is mostly similar to what you will find in a demoscene project (using bobs, sprites, the blitter, copper lists, playing ProTracker modules...)
My only recommendation, whether you have the proper hardware at home or not, is to do most of your work inside an emulator. WinUAE is a near-perfect, highly accurate emulator. You will also benefit from all the modern tools to edit/version your code (Visual Studio Code, Git...).
Besides the Amiga, I would recommend 2 oldschool-flavored machines (as of 2018) :
The SEGA Megadrive/Genesis, which can be addressed in a super friendly way using the excellent SGDK library: https://github.com/Stephane-D/SGDK
The Pico-8, a virtual oldschool console that can (must) be programmed in Lua. Pico-8 can be understood as an emulator for an 8-bit-ish game console that never existed. Its community is really active, though. It can be found here: https://www.lexaloffle.com/pico-8.php
Again, regarding the Megadrive, an emulator might be your best friend.
Once your creation is worth testing on the real machine, and if you need to "burn" (copy) it onto a floppy disk, my recommendation would be to get an Amiga 1200 with a PCMCIA/CompactFlash adapter. Using a Gotek (a floppy drive emulator that reads disk images from a USB stick) can be a cheaper alternative that will work on your A500 too (but please don't butcher your Amiga to fit the Gotek in it).
The Megadrive comes with a nice option nowadays, thanks to the Everdrive SD-Card adapter, that allows you to start any ROM file of your own on the real console.
to run/test your Amiga stuff on Windows use WinUAE: http://www.winuae.net/
you can use pretty much any programming language you prefer, but C is probably best supported (many compilers, official docs), and assembler (Motorola MC68000 CPU) gives you total control of the hardware for fancy effects and super efficient coding (fast graphics, ...). There are also (more or less) Amiga specific languages like e.g. AMOS or BlitzBasic/AmiBlitz, which (more or less) are designed to handle Amiga-specific stuff or simplify things (screen handling, sprites, music, hardware access, ...)
if you want to use C, then it's probably best to use Amiga's "official" compiler, if possible: SAS-C (formerly: Lattice-C). If you can't find that one try vbcc, gcc, DiceC, ... (this CD has a full gcc/POSIX development environment (and much more): https://archive.org/details/cdrom-geek-gadgets-2 )
here are a lot of C- and AmigaOS-coding docs: http://amigadev.elowar.com/
a PC floppy disk drive won't read or write your Amiga floppy disk out-of-the-box, you need some additional hardware, try googling some of these: ADTWin, ADF-Copy / ADF-Drive, Disk2FDI, "Arduino Amiga Floppy Disk Reader/Writer"
a nullmodem cable (e.g. nullmodem-serial-USB-adapter cable to PC or Linux box) is a simple way to transfer files from/to Amiga - try google: "amiga SER: nullmodem", or "Amiga Explorer" (a windows application, also lets you directly write .adf disk images from the PC to the Amiga's floppy drive)
Well, of course you can learn from old stuff on floppy disks; it just depends on the contents of the disk and your know-how. ;-) If the disk contains some source code, you're good. More likely - meaning any commercial game etc. - it will contain an executable made of machine code, written in assembler or compiled from C or whatever, or even a custom loader (= non-AmigaDOS disk, no DOS access), for which you need some serious understanding of M68000 instructions, the Amiga hardware, and maybe the AmigaOS API. In other words: it's probably "over-compiled", unless you're an assembler guru, or want to become one.
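To give a feel for what "serious understanding of M68000 instructions" means in practice, here is a toy decoder for a tiny subset of 68000 opcodes (MOVEQ, NOP, RTS), sketched in Python. The encodings shown are real 68000 ones, but a real disassembler has to handle hundreds of instruction formats and addressing modes:

```python
import struct

def disasm_word(w):
    """Decode one 68000 instruction word (toy subset only)."""
    if w == 0x4E71:
        return "nop"
    if w == 0x4E75:
        return "rts"
    if (w & 0xF100) == 0x7000:          # MOVEQ #imm,Dn
        reg = (w >> 9) & 7
        imm = w & 0xFF
        if imm >= 0x80:                 # 8-bit immediate is sign-extended
            imm -= 0x100
        return f"moveq #{imm},d{reg}"
    return f"dc.w ${w:04x}"             # unknown: emit as raw data

# A tiny "ROM" dump, big-endian words like on the real machine:
rom = bytes.fromhex("70ff4e714e75abcd")
for (w,) in struct.iter_unpack(">H", rom):
    print(disasm_word(w))
# → moveq #-1,d0 / nop / rts / dc.w $abcd
```

The hard part isn't the decoding itself but telling code apart from data and reconstructing the program's intent, which is why reading a commercial game's binary takes real expertise.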
note: as you said you want to write a game for your Amiga 500 - that means we're talking "classic" Amiga: m68k CPU, A500, A1200, A2000, etc., Workbench/Kickstart up to version 3.9/3.1, floppy disks, etc., including modern re-implementations like MiniMIG, Vampire, M5, etc. -
if you're interested in non-classic, probably called "next-generation", Amiga computing, try googling some of these: AmigaOS4, AROS, MorphOS, AmigaONE
Maybe you are looking for a ready-made game engine:
REDPILL
Scorpion Engine
Amiga C Engine

Benefits of cross-platform development?

Are there benefits to developing an application on two or more different platforms? Does using a different compiler on even the same platform have benefits?
Yes, especially if you plan to distribute your code for multiple platforms.
But even if you don't, cross-platform development is a form of future-proofing; if it runs on multiple (diverse) platforms today, it's more likely to run on future platforms than something that was tuned, tweaked, and specialized to work on a version 7.8.3 clean install of vendor X's Q-series boxes (patch level 1452) and nothing else.
There seems to be a benefit in finding, and simply preventing, bugs by using a different compiler and a different OS. Different CPUs can pin down endianness issues early. There is some pain at the GUI level if you want to stay native there.
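The endianness point is easy to demonstrate; Python's struct module makes the byte-order difference visible directly:

```python
import struct

value = 0x01020304
big = struct.pack(">I", value)      # m68k / SPARC / network byte order
little = struct.pack("<I", value)   # x86 byte order

print(big.hex(), little.hex())      # 01020304 04030201

# Code that writes a struct with native byte order and reads it back on
# a different CPU silently gets scrambled values -- exactly the class of
# bug that only surfaces when you build on a second architecture.
assert struct.unpack("<I", big)[0] == 0x04030201
```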
Short answer: Yes.
Short of cloning a disk, it is almost impossible to make two systems exactly alike, so you are going to end up running on "different platforms" whether you meant to or not. By specifically confronting and solving the "what if system A doesn't do things like B?" problem head on you are much more likely to find those key assumptions your code makes.
That said, I would say you should get a good chunk of your base code working on system A, and then take a day (or a week or ...) and get it running on system B. It can be very educational.
My education came back in the 80's when I ported a source level C debugger to over 100 flavors of U*NX. Gack!
Are there benefits to developing an application on two or more different platforms?
If this is production software, the obvious reason is the lure of a larger client base. Your product's appeal is magnified the moment the client hears that you support multiple platforms. Remember, most enterprises do not use a single OS or even a single version of the OS. It is fairly typical to find one section using Windows, another Mac, and a smaller one some flavor of Linux.
It is also often the case that tailoring a product to a single platform is more tedious than making it run on multiple platforms; the law of diminishing returns kicks in before you know it.
Of course, all of this makes little sense, if you are doing customization work for an existing product for the client's proprietary hardware. But even then, keep an eye out for the entire range of hardware your client has in his repertoire -- you never know when he might ask for it.
Does using a different compiler on even the same platform have benefits?
Yes, again. Different compilers implement different extensions. See to it that you are not dependent on a particular version of a particular compiler.
Further, there may be a bug or two in the compiler itself. Using multiple compilers helps sort these out.
I have also seen bits of a (cross-platform) product using two different compilers -- one was used in those modules where floating-point manipulation required a very high level of accuracy. (It's been a while since I've heard of anyone else doing that, but...)
I've ported a large C++ program, originally Win32, to Linux. It wasn't very difficult. Mostly dealing with compiler incompatibilities, because the MS C++ compiler at the time was non-compliant in various ways. I expect that problem has mostly gone now (until C++0x features start gradually appearing). Also writing a simple platform abstraction library to centralize the platform-specific code in one place. It depends to what extent you are dependent on services from the OS that would be hard to mimic on a new platform.
You don't have to build portability in from the ground up. That's why "porting" is often described as an activity you can perform in one shot after an initial release on your most important platform. You don't have to do it continuously from the very start. Purely for economic reasons, if you can avoid doing work that may never pay off, obviously you should. The cost of porting later on, when really necessary, turns out to be not that bad.
Mostly, there is an existing platform that the application is written for (custom software). But you reach more developers (on both platforms) if you decide to use a platform-independent language.
Also, products (off-the-shelf software) for SMEs sell better if they run on different platforms! You gain access to both markets, Windows & Linux (and macOS, and so on...).
Big companies mostly buy hardware which is supported/certified by the product vendor only to deploy the specified product.
If you develop on multiple platforms at the same time, you get the advantage of being able to use different tools. For example, I once had a memory overwrite (I still swear I didn't need the +1 for the null byte!) that caused "free" to crash. I brought the code up to speed on Windows and found the overwrite in about 1 minute with Rational Purify... I had spent a week chasing it under Linux (valgrind might have found it... but I didn't know about it at the time).
Different compilers on the same or different platforms is, to me, a must as each compiler will report different things, and sometimes the report from one compiler about an error will be gibberish but the other compiler makes it very clear.
Using multiple databases while developing means you are much less likely to tie yourself to a particular database, which means you can swap out the database if there is a reason to do so. If you want to integrate something that uses Oracle into an existing infrastructure that uses SQL Server, for example, it can really suck - much better if the Oracle or SQL Server pieces can be moved to the other system (I know of some places that have 3 different databases for their financial systems... ick).
In general, always developing for two or three things means that the odds of you finding mistakes is better, and the odds of the system being more flexible is better.
On the other hand all of that can take time and effort that, at the immediate time, is seen as an unneeded expense.
Some platforms have really dreadful development tools. I once worked at an investment bank where, rather than use Sun's ghastly toolset, people developed code in VC++ and then ported it to Solaris.

Reverse engineering war stories [closed]

Sometimes you don't have the source code and need to reverse engineer a program or a black box. Any fun war stories?
Here's one of mine:
Some years ago I needed to rewrite a device driver for which I didn't have source code. The device driver ran on an old CP/M microcomputer and drove a dedicated phototypesetting machine through a serial port. Almost no documentation for the phototypesetting machine was available to me.
I finally hacked together a serial port monitor on a DOS PC that mimicked the responses of the phototypesetting machine. I cabled the DOS PC to the CP/M machine and started logging the data coming out of the device driver as I fed data in through the CP/M machine. This enabled me to figure out the handshaking and encoding used by the device driver and re-create an equivalent one for a DOS machine.
Read the story of FCopy for the C-64 here:
Back in the 80s, the Commodore C-64 had an intelligent floppy drive, the 1541, i.e. an external unit that had its own CPU and everything.
The C-64 would send commands to the drive, which in turn would execute them on its own - reading files and such - then send the data to the C-64, all over a proprietary serial cable.
The manual for the 1541 mentioned, besides the commands for reading and writing files, that one could also read and write its internal memory space. Even more exciting, one could download 6502 code into the drive's memory and have it executed there.
This got me hooked and I wanted to play with that - execute code on the drive. Of course, there was no documentation on what code could be executed there, or which functions it could use.
A friend of mine had written a disassembler in BASIC, and so I read out all of the drive's ROM contents - 16KB of 6502 CPU code - and tried to understand what it does. The OS on the drive was quite amazing and advanced IMO - it had a kind of task management, with commands being sent from the communication unit to the disk I/O task handler.
I learned enough to understand how to use the disk i/o commands to read/write sectors of the disc. Actually, having read the Apple ]['s DOS 3.3 book which explained all of the workings of its disk format and algos in much detail, was a big help in understanding it all.
(I later learned that I could also have found reverse-engineered info on the 4032/4016 disk drives for the "business" Commodore models, which worked much the same as the 1541, but that was not available to me as a rather disconnected hobby programmer at the time.)
Most importantly, I also learned how the serial comms worked. I realized that the serial comms, using 4 lines, two for data, two for handshake, was programmed very inefficiently, all in software (though done properly, using classic serial handshaking).
Thus I managed to write a much faster comms routine, in which I made fixed timing assumptions, using both the data and the handshake lines for data transmission.
Now I was able to read and write sectors, and also transmit data faster than ever before.
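The trick described above - dropping per-bit handshaking in favour of fixed timing so that both wires can carry data - can be sketched abstractly like this. This is a simulation of the idea only, not real 1541 transfer code:

```python
def send_standard(byte):
    """Classic protocol: one wire carries data, the other handshakes,
    so a byte needs 8 transfer cycles."""
    return [(byte >> i) & 1 for i in range(8)]

def send_fast(byte):
    """Fixed-timing trick: both wires carry data simultaneously,
    so a byte needs only 4 transfer cycles."""
    return [((byte >> (2 * i)) & 1, (byte >> (2 * i + 1)) & 1)
            for i in range(4)]

def recv_fast(pairs):
    """Reassemble a byte from 4 two-bit transfers."""
    byte = 0
    for i, (data_bit, handshake_bit) in enumerate(pairs):
        byte |= data_bit << (2 * i)
        byte |= handshake_bit << (2 * i + 1)
    return byte

# Round-trips correctly for every byte value, at half the cycle count:
assert all(recv_fast(send_fast(b)) == b for b in range(256))
print(len(send_standard(0xA5)), len(send_fast(0xA5)))   # 8 vs 4 cycles
```

The cost is that both sides must agree on exact timing, since the handshake line can no longer signal "ready" - which is why the routine needed fixed timing assumptions.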
Of course, it would have been great if one could simply load some code into the drive to speed up the comms, and then use the normal commands to read a file, which in turn would use the faster comms. This was not possible, though, as the OS on the drive did not provide any hooks for it (mind that all of the OS was in ROM, unmodifiable).
Hence I was wondering how I could turn my exciting findings into a useful application.
Having been a programmer for a while already, and dealing with data loss all the time (music tapes and floppy discs were not very reliable back then), I thought: Backup!
So I wrote a backup program which could duplicate a floppy disc at never-before-seen speed: the first version copied an entire 170 KB disc in only 8 minutes (yes, minutes), and the second version did it in about 4.5 minutes, whereas the apps before mine took over 25 minutes. (Mind you, the Apple ][, which had its disc OS running on the Apple directly, with fast parallel data access, did all this in a minute or so.)
And so FCopy for the C-64 was born.
It soon became extremely popular - not as a backup program, as I had intended, but as the primary choice for anyone wanting to copy games and other software for their friends.
It turned out that a simplification in my code - simply skipping unreadable sectors and writing a sector with a bad CRC to the copy - circumvented most of the then-used copy protection schemes, making it possible to copy most formerly uncopyable discs.
I tried to sell my app and actually sold it 70 times. When it got advertised in the magazines, claiming it would copy a disc in less than 5 minutes, customers would call and not believe it, "knowing better" that it couldn't be done, yet giving it a try.
Not much later, others started to reverse engineer my app, and optimize it, making the comms even faster, leading to copy apps that did it even in 1.5 minutes. Faster was hardly possible, because, due to the limited amount of memory available on the 1541 and the C-64, you had to swap discs several times in the single disc drive to copy all 170 KB of its contents.
In the end, FCopy and its optimized successors were probably the most-popular software ever on the C-64 in the 80s. And even though it didn't pay off financially for me, it still made me proud, and I learned a lot about reverse-engineering, futility of copy protection and how stardom feels. (Actually, Jim Butterfield, an editor for a C-64 magazine in Canada, told its readers my story, and soon he had a cheque for about 1000 CA$ for me - collected by the magazine from many grateful users sending 5$-cheques, which was a big bunch of money back then for me.)
I actually have another story:
A few years past my FCopy "success" story, I was approached by someone who asked me if I could crack a slot machine's software.
This was in Germany, where almost every pub had one or two of those: You'd throw some money in what amounts to about a US quarter, then it would spin three wheels and if you got lucky with some pattern, you'd then have the choice to "double or nothing" your win on the next play, or get the current win. The goal of the play was to try to double your win a few times until you'd get in the "series" mode where any succeeding win, no matter how minor, would get you a big payment (of about 10 times your spending per game).
The difficulty was knowing when to double and when not to. To an "outsider" this was completely random, of course. But it turned out that those German-made machines were using simple pseudo-randomized tables in their ROMs. Now, if you watched the machine play for a few rounds, you could figure out where this "random table pointer" was and predict its next move. That way, a player would know when to double and when to pass, eventually leading him to the "big win series".
Now, this was already a common thing when this person approached me. There was an underground scene which had access to the ROMs in those machines, found the tables, and created software for computers such as the C-64 to predict the machine's next moves.
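The table-based prediction those crackers used can be sketched like this. The table and the observed outcomes below are invented for illustration; a real machine's table came straight out of its ROM:

```python
def locate_pointer(table, observed):
    """Return the table index of the *next* value, for every position
    where the observed run of outcomes matches the ROM table."""
    n, k = len(table), len(observed)
    return [(start + k) % n for start in range(n)
            if all(table[(start + i) % n] == observed[i] for i in range(k))]

# Hypothetical 16-entry win/lose table ripped from a machine's ROM:
table = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1]
observed = [1, 1, 0, 1, 0, 0, 0]      # outcomes watched at the machine
hits = locate_pointer(table, observed)
print(hits, table[hits[0]])           # → [11] 1: next outcome is a win
```

Watch enough rounds and the match becomes unique, at which point every future outcome is known - which is exactly why this only worked as long as the machines used fixed tables.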
Then came a new type of machine, though, which used a different algorithm: Instead of using pre-calc'd tables, it did something else and none of the resident crackers could figure that out. So I was approached, being known as a sort of genius since my FCopy fame.
So I got the ROMs. 16KB, as usual. No information whatsoever on what it did or how it worked. I was on my own. Even the code didn't look familiar (I only knew 6502 and 8080 by then). After some digging and asking around, I found it was a 6809 (which I found to be the nicest 8-bit CPU in existence, with analogies to the 680x0 CPU design, which was much more linear than the x86 family's instruction mess).
By that time, I already had a 68000 computer (I worked for the company "Gepard Computer", which built and sold such a machine, with its own developer OS and all) and was into programming Modula-2. So I wrote a disassembler for the 6809, one that helped me with reverse engineering by finding subroutines, jumps, etc. Slowly I got an idea of the flow control of the slot machine's program. Eventually I found some code that looked like a mathematical algorithm, and it dawned on me that this could be the random generating code.
As I never had a formal education in computer science, up to then I had no idea how a typical randomgen using mul, add and mod worked. But I remembered seeing something about it in a Modula-2 book and then realized what it was.
Now I could quickly find the code that called this randomgen and learn which "events" led to a randomgen iteration, meaning I knew how to predict the next iterations and their values during a game.
What was left was to figure out the current position of the randomgen. I had never been good with abstract things such as algebra, but I knew someone who studied math and was a programmer too. When I called him, he quickly knew how to solve the problem and went on at length about how simple it would be to determine the randomgen's seed value. I understood nothing. Well, I understood one thing: the code to accomplish this would take a lot of time to run, and a C-64 or any other 8-bit computer would take hours if not days for it.
Thus, I decided to offer him 1000 DM (which was a lot of money for me back then) if he could write me an assembler routine in 68000. Didn't take him long and I had the code which I could test on my 68000 computer. It took usually between 5 and 8 minutes, which was acceptable. So I was almost there.
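The seed-recovery step can be sketched with a classic mul/add/mod generator. The constants and the output mapping below are purely illustrative (the real machine's came out of its ROM, and the real solver was cleverer than brute force), but the shape of the problem is the same: watch a few outputs, find the hidden state, then predict everything that follows:

```python
from itertools import islice

def rand_stream(seed, a=1103515245, c=12345, m=2**31):
    """Classic mul/add/mod (linear congruential) generator."""
    while True:
        seed = (a * seed + c) % m
        yield seed % 100              # pretend the game exposes a value 0..99

def recover_seed(observed, search_range=200_000):
    """Brute-force the hidden seed from a short run of observed outputs."""
    for candidate in range(search_range):
        if list(islice(rand_stream(candidate), len(observed))) == observed:
            return candidate
    return None

true_seed = 123_456                   # unknown to the attacker
observed = list(islice(rand_stream(true_seed), 8))
seed = recover_seed(observed)
future = list(islice(rand_stream(seed), 12))[8:]   # predict upcoming games
assert future == list(islice(rand_stream(true_seed), 12))[8:]
```

On a modern machine this search is instant; on 1980s hardware even a smarter algebraic attack took the 68000 several minutes, which matches the 5-8 minutes mentioned below.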
It still required a portable 68000 computer to be carried to the pub where the slot machine stood. My Gepard computer was clearly not of the portable type. Luckily, someone else I knew in Germany produced entire 68000 computers on a small circuit board. For I/O it only had serial comms (RS-232) and a parallel port (Centronics was the standard in those days). I could hook up some 9V block batteries to make it work. Then I bought a Sharp pocket computer, which had a rubber keyboard and a single-line 32-character display, and ran on batteries - that was my terminal. It had an RS-232 connector which I connected to the 68000 board. The Sharp also had some kind of non-volatile memory, which allowed me to store the 68000 random-cracking software on the Sharp and transfer it on demand to the 68000 computer, which then calculated the seed value. Finally, I had a small Centronics printer which printed on narrow thermo paper (the size of what cash registers use to print receipts). Once the 68000 had the results, it would send a row of results for the upcoming games on the slot machine to the Sharp, which printed them on paper.
So, to empty one of these slot machines, you'd work in pairs: you'd start playing and write down the results; once you had the minimum number of games required for the seed calculation, one of you would go to the car parked outside, turn on the Sharp, enter the results, have the 68000 computer rattle for 8 minutes, and out came a printed list of upcoming game runs. Then all you needed was this tiny piece of paper: take it back to your buddy, who had kept the machine occupied, align the past results with the printout, and no more than 2 minutes later you were "surprised" to win the all-time 100s series. You'd then play these 100 games, practically emptying the machine (and if the machine ran empty before the 100 games were played, you had the right to wait for it to be refilled, maybe even come back the next day, while the machine stayed locked until you returned).
This wasn't Las Vegas, so you'd only get about 400 DM out of a machine that way, but it was quick and sure money, and it was exciting. Some pub owners suspected us of cheating but could do nothing against us under the laws back then; even when some called the police, the police sided with us.
Of course, the slot machine company soon got wind of this and tried to counteract it, turning off those particular machines until new ROMs were installed. But the first few times, they only changed the randomgen's numbers. We only had to get hold of the new ROMs, and it took me a few minutes to find the new numbers and put them into my software.
So this went on for a while, during which my friends and I toured the pubs of several towns in Germany looking for those machines only we could crack.
Eventually, though, the machine maker learned how to "fix" it: Until then, the randomgen was only advanced at certain predictable times, e.g. something like 4 times during play, and once more per the player's pressing of the "double or nothing" button.
But then they finally changed it so that the randomgen would continually be polled, meaning we were no longer able to predict the next seed value exactly on time for the pressing of the button.
That was the end of it. Still, making the effort of writing a disassembler just for this single crack, finding the key routines in 16KB of 8 bit CPU code, figuring out unknown algorithms, investing quite a lot of money to pay someone else to develop code I didn't understand, finding the items for a portable high-speed computer involving the "blind" 68000 CPU with the Sharp as a terminal and the printer for the convenient output, and then actually emptying the machines myself, was one of the most exciting things I ever did with my programming skills.
Way back in the early 90s, I forgot my Compuserve password. I had the encrypted version in CIS.INI, so I wrote a small program to do a plaintext attack and analysis in an attempt to reverse-engineer the encryption algorithm. 24 hours later, I figured out how it worked and what my password was.
Soon after that, I did a clean-up and published the program as freeware so that Compuserve customers could recover their lost passwords. The company's support staff would frequently refer these people to my program.
It eventually found its way onto a few bulletin boards (remember them?) and Internet forums, and was included in a German book about Compuserve. It's still floating around out there somewhere. In fact, Google takes me straight to it.
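The answer above doesn't describe CompuServe's actual scheme, but the general shape of a known-plaintext attack is easy to show with a toy repeating-key XOR cipher (the key bytes here are made up):

```python
def xor_crypt(data, key):
    """Toy repeating-key XOR 'encryption'; applying it twice decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def recover_key(ciphertext, known_plaintext, key_len):
    """Known-plaintext attack: ciphertext XOR plaintext reveals the key."""
    stream = bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))
    return stream[:key_len]

key = b"\x5a\x3c\x7e"                          # the "secret" key
stored = xor_crypt(b"hunter2", key)            # what ends up in the INI file
# Set a password you know, look at the stored blob, derive the key...
recovered = recover_key(stored, b"hunter2", len(key))
# ...then decrypt the password you forgot:
assert xor_crypt(stored, recovered) == b"hunter2"
```

Real schemes are rarely this simple, but the workflow - encrypt known inputs, compare against stored output, deduce the transformation - is exactly the plaintext-attack approach the answer describes.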
Once, when playing Daggerfall (The Elder Scrolls II), I could not afford the Daedric Dai-Katana, so I hex-edited the savegame.
Being serious though, I managed to remove the dongle check from my dad's AutoCAD installation using SoftICE many years ago. This was before the Internet was big. He works as an engineer, so he had a legitimate copy; he had just forgotten the dongle at his job, he needed to get some things done, and I thought it would be a fun challenge. I was very proud afterwards.
Okay, this wasn't reverse engineering (quite) but a simple hardware hack born of pure frustration. I was an IT manager for a region of Southwestern Bell's cell phone service in the early 90s. My IT department was dramatically underfunded and so we spent money on smart people rather than equipment.
We had a WAN between major cities, used exclusively for customer service, with critical IP links. Our corporate bosses were insistent that we install a network monitoring system to notify us when the lines went down (no money for redundancy, but plenty of bucks for handling failures. Sigh.)
The STRONGLY recommended solution ran on a SPARC workstation and started at $30K plus the cost of a SPARC station (around $20K then), together which was a substantial chunk of my budget. I couldn't see it - this was a waste of $$. So I decided a little hacking was in order.
I took an old PC scheduled for destruction, put a copy of ProComm on it (remember ProComm?), and had it ping each of the required nodes along the route (this was one of the later versions of ProComm that scripted FTP as well as serial lines, KERMIT, etc.). A little logic in the script fired off a pager message when a node couldn't be reached. I had already used ProComm to cobble together a pager system for our techs, so I reused the pager code. The script ran continuously, sending a ping once each minute across each of the critical links, and branched into the pager code when a ping wasn't returned.
We duplicated this system at each critical location for a cost of less than $500 and had a very fast notification when a link went down. Next issue - one of our first trouble-shooting methods was to power-cycle our routers and/or terminal servers. I got some dial-up X10 controllers and a few X10 on/off appliance power switches. You had to know the correct phone number to use and the correct tones to push, but we printed up a cheat card for each technician and they kept it with their pager. Instant fast response! One of my techs then programmed the phones we all had to reset specific equipment at each site as a speed-dial. One-tech solves the problem!
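The heart of such a monitor - poll each node, page only on a down transition - fits in a few lines. This sketch substitutes a fake reachability check for the real ping, and the node names are invented:

```python
import time

def monitor(nodes, is_up, alert, interval=60, rounds=None):
    """Poll every node each interval; fire one alert per new outage."""
    down = set()
    i = 0
    while rounds is None or i < rounds:
        for node in nodes:
            if not is_up(node):
                if node not in down:          # alert only on transition
                    down.add(node)
                    alert(f"LINK DOWN: {node}")
            else:
                down.discard(node)
        i += 1
        if rounds is None or i < rounds:
            time.sleep(interval)

# Demo with a fake reachability check instead of a real ping:
failed = {"router-dallas"}
alerts = []
monitor(["router-dallas", "router-austin"],
        is_up=lambda n: n not in failed,
        alert=alerts.append,
        interval=0, rounds=2)
print(alerts)   # one page for the down router, not one per poll
```

Tracking the down set is what keeps a flapping or long-dead link from paging you once a minute forever.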
Now the "told-you-so" unveiling.
I'm sitting at lunch with our corporate network manager in Dallas who is insisting on a purchase of the Sun-based network management product. I get a page that one of our links is down, and then a second page. Since the pager messages are coming from two different servers, I know exactly which router is involved (it was a set-up, I knew anyway, as the tech at the meeting with me was queued to "down a router" during the meal so we could show-off.) I show the pager messages to the manager and ask him what he would do to resolve this problem. He eyes me suspiciously, since he hasn't yet successfully been paged by his Solaris NMS system that is supposed to track critical links. "Well, I guess you'd better call a tech and get them to reset the router and see if that fixes it." I turned to the tech who was lunching with us and asked him to handle it. He drew out his cell phone (above the table this time) and pressed the speed-dial he had programmed to reset the router in question. The phone dialed the X10 switch, told it to power-off the router, paused for five seconds, told it to power-up the router, and disconnected. Our ProComm script sent us pages telling us the link was back up within three minutes of this routine. :-)
The corporate network manager was very impressed. He asked me what the cost was for my new system. When I told him less than $1K, he was apoplectic. He had just ordered a BIG set of the Sun Solaris network management solution just for the tasks I'd illustrated. I think he had spent something like $150K. I told him how the magic was done and offered him the ProComm script for the price of lunch. TANSTAAFL. He told me he'd buy me lunch to keep my mouth shut.
Cleaning out my old drawers of disks and such, I found a copy of the code - "Pingasaurus Rex" was the name I had given it. That was hacking in the good old days.
The most painful for me was a product where we wanted to include an image in an Excel spreadsheet (a few years back, before the open standards). I had to get an "understanding", if such a thing exists, of the internal format of the docs as well. I ended up doing some hex comparison between files with and without the image to figure out how to put it there, plus working on some little-endian math...
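For anyone curious, the hex-comparison trick is easy to sketch. This is a hypothetical example (the byte layouts below are invented, not the real Excel format): diff a file with and without the embedded object, then decode the little-endian value at the offsets that changed.

```python
# Compare two binary files byte-by-byte to find where they diverge,
# then decode a little-endian 32-bit value at the first difference.
import struct

def diff_offsets(data_a, data_b):
    """Return the offsets at which the two byte strings differ."""
    return [i for i, (a, b) in enumerate(zip(data_a, data_b)) if a != b]

# Fabricated file contents standing in for "without image" / "with image"
without_image = bytes(16)
with_image = bytes(8) + struct.pack("<I", 0x12345678) + bytes(4)

offsets = diff_offsets(without_image, with_image)
print(offsets)  # the differing bytes show where the new record lives

# Read the changed bytes as a little-endian unsigned 32-bit integer
value = struct.unpack_from("<I", with_image, offsets[0])[0]
print(hex(value))
```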
I once worked on a tool that would gather inventory information from a PC as it logged into the network. The idea was to keep track of all the PCs in your company.
We had a new requirement to support the Banyan VINES network system, now long forgotten but pretty cool at the time it came out. I couldn't figure out how to get the Ethernet MAC address from Banyan's adapter as there was no documented API to do this.
Digging around online, I found a program that some other Banyan nerd had posted that performed this exact action. (I think it would store the MAC address in an environment variable so you could use it in a script). I tried writing to the author to find out how his program worked, but he either didn't want to tell me or wanted some ridiculous amount of money for the information (I don't recall).
So I simply fired up a disassembler and took his utility apart. It turned out he was making one simple call to the server, which was an undocumented function code in the Banyan API. I worked out the details of the call pretty easily, it was basically asking the server for this workstations address via RPC, and the MAC was part of the Banyan network address.
I then simply emailed the engineers at Banyan and told them what I needed to do. "Hey, it appears that RPC function number 528 (or whatever) returns what I need. Is that safe to call?"
The Banyan engineers were very cool, they verified that the function I had found was correct and was pretty unlikely to go away. I wrote my own fresh code to call it and I was off and running.
Years later I used basically the same technique to reverse engineer an undocumented compression scheme on an otherwise documented file format. I found a little-known support tool provided by the (now defunct) company which would decompress these files, and reverse engineered it. It turned out to be a very straightforward Lempel-Ziv variant applied within the block structure of their file format. The results of that work are recorded for posterity in the Wireshark source code, just search for my name.
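The general shape of such a decompressor is tiny. Here is a minimal sketch of the LZ idea with an invented token format (not the actual scheme from that file format): a stream of literal bytes and (distance, length) back-references into the output produced so far.

```python
# Minimal LZ77-style decompressor. Token format invented for illustration:
#   0x00 <byte>                -> emit a literal byte
#   0x01 <distance> <length>   -> copy `length` bytes from `distance` back
def lz_decompress(stream):
    out = bytearray()
    i = 0
    while i < len(stream):
        tag = stream[i]
        if tag == 0x00:                      # literal byte
            out.append(stream[i + 1])
            i += 2
        else:                                # back-reference
            distance, length = stream[i + 1], stream[i + 2]
            for _ in range(length):          # byte-at-a-time: copies may overlap
                out.append(out[-distance])
            i += 3
    return bytes(out)

# "abcabcabc": three literals, then one overlapping copy of length 6
compressed = bytes([0, ord("a"), 0, ord("b"), 0, ord("c"), 1, 3, 6])
print(lz_decompress(compressed))
```

The byte-at-a-time copy matters: a back-reference may overlap the bytes it is currently producing, which is how LZ schemes encode runs cheaply.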
Almost 10 years ago, I picked up the UFO/XCOM Collector's Edition in the bargain bin at a local bookstore, mostly out of nostalgia. When I got back home, I got kinda excited that it had been ported to Windows (the DOS versions didn't run under win2k)... and then disappointed that it had garbled graphics.
I was about to shrug my shoulders (bargain bin and all), but then my friend said "Haven't you... bugfixed... software before?", which led to a night of drinking lots of cola and reverse-engineering while hanging out with my friend. In the end, I had written a bugfix loader that fixed the pitch vs. width issue, and could finally play the first two XCOM games without booting old hardware (DOSBox wasn't around yet, and my machine wasn't really powerful enough for full-blown virtualization).
The loader gained some popularity, and was even distributed with the Steam re-release of the games for a while - I think they've switched to DOSBox nowadays, though.
I wrote a driver for the Atari ST that supported Wacom tablets. Some of the Wacom information could be found on their web sites, but I still had to figure out a lot on my own.
Then, once I had written a library to access the Wacom tablets (and a test application to show the results), it dawned on me that there was no API in the OS (GEM windowing system) to actually place the mouse cursor somewhere. I ended up having to hook some interrupts in something called the VDI (like GDI in Windows), and be very very careful not to crash the computer inside there. I had some help (in the form of suggestions) from the developers of an accelerated version of the VDI (NVDI), and everything was written in PurePascal. I still occasionally have people asking me how to move the mouse cursor in GEM, etc.
I've had to reverse engineer a video-processing app, where I only had part of the source code. It took me weeks and weeks to even work out the control-flow, as it kept using CORBA to call itself, or be called from CORBA in some part of the app that I couldn't access.
Sheer idiocy.
I recently wrote an app that downloads the whole content from a Domino webmail server using curl. This is because the subcontractor running the server asks for a few hundred bucks for every archive request.
They changed their webmail version about one week after I released the app for the department, but I managed to make it work again with a GREAT deal of regex and XML wrangling.
When I was in high school they introduced special hours each week (there were 3 hours if I recall correctly) in which we had to pick a classroom with a teacher there to help with any questions on their subject. Now of course everybody always wanted to spend their time in the computer room to play around on the computers there.
To choose the room where you would spend these hours, there was an application that monitored how many students had signed up for each room, so you had to reserve your slot in time or there was not much choice of where to go.
At that time I always liked to play around on the computers there, and I had already obtained administrator access; only for this application that would not help me much. So I used my administrator access to make a copy of the application and took it home to examine. Now I don't remember all the details, but I discovered this application used a Microsoft Access database file located on a hidden network share. After taking a copy of this database file I found there was a password on the database. Using some Linux tools for Access databases I could easily work around that, and after that it was easy to import the database into my own MySQL server.
Then, through a simple web interface, I could find details for every student in the school, change their slots, and promote myself to sit in my room of choice every time.
The next step was to write my own application that would let me just select a student from the list and change anything without having to look up their password; it was implemented in only a few hours.
While not as impressive a story as some others in this thread, I still remember it was a lot of fun to do as a high school kid back then.

How would you go about reverse engineering a set of binary data pulled from a device?

A friend of mine brought up this question the other day: he recently bought a Garmin heart rate monitor which keeps track of his heart rate and allows him to upload his stats for a day to his computer.
The only problem is that there are no Linux drivers for the Garmin USB device. He's managed to interpret some of the data, such as the model number and his user details, and has identified some binary data tables which we assume represent a series of recordings of his heart rate and the time each recording was taken.
Where does one start when reverse engineering data when you know nothing about the structure?
I had the same problem and initially found this project at Google Code that aims to complete a cross-platform version of tools for the Garmin devices ... see: http://code.google.com/p/garmintools/. There's a link on the front page of that project to the protocols you need, which Garmin was thoughtful enough to release publicly.
And here's a direct link to the Garmin I/O specification: http://www.garmin.com/support/pdf/IOSDK.zip
I'd start by looking at the data in a hex editor, hopefully a good one that knows the most common encodings (ASCII, Unicode, etc.), and then try to make sense of it based on the data you know it has stored.
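If you don't have a hex editor handy, a throwaway dump routine does the job. A minimal sketch that prints offset, hex and ASCII columns (the sample bytes are fabricated):

```python
# Quick hex+ASCII dump -- the usual first step when staring at unknown data.
def hexdump(data, width=16):
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        asciipart = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3}} {asciipart}")
    return "\n".join(lines)

print(hexdump(b"GARMIN\x00\x01\x02\x03Hello"))
```

Printable runs (device names, user details) jump out in the ASCII column; everything else is where the interesting binary tables hide.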
As another poster mentioned, reverse engineering can be hairy, not in practice but in legality.
That being said, you may be able to find everything related to your question by checking out this project and its code... and they do handle the runner's heart rate/GPS combo data as well:
http://www.gpsbabel.org/
I'd suggest you start with checking the legality of reverse engineering in your country of origin. Most countries have very strict laws about what is allowed and what isn't regarding reverse engineering devices and code.
I would start by seeing what data is being sent by the device, then consider how such data could be represented and packed.
I would first capture many samples and see if any pattern presents itself, since heartbeats are regular, and a repeating pattern would suggest a measurement related to the heart itself. I would also look for bit fields which are monotonically increasing, as that would suggest some sort of timestamp.
Having formed a hypothesis for what is where, I would write a program to test it, graph the results, and see if they make sense. If they do, but not quite, closer inspection would probably reveal that you need some scaling factors here or there. It is also entirely possible I'd need to process the data first before it looks anything like what their program is showing, e.g. I might need to integrate the data points. If I get garbage, then it is back to the drawing board :-)
I would also check the manufacturer's website, or maybe run strings on their binaries. Finding someone who works in the field of biomedical engineering would also be on my list, as they would probably know what protocols are typically used, if any. I would also look for these protocols and see if any could be applied to the data I am seeing.
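The monotonically-increasing-field idea above is easy to automate. A sketch with a fabricated record layout: assume a fixed record size, then try each field offset and report the ones whose values only ever increase from record to record (likely timestamp candidates).

```python
# Scan a dump for offsets (within an assumed fixed-size record) where
# consecutive little-endian u16 values strictly increase -- timestamp candidates.
import struct

def monotonic_candidates(data, record_size):
    """Yield field offsets whose u16 values strictly increase across records."""
    n_records = len(data) // record_size
    for off in range(record_size - 1):
        vals = [struct.unpack_from("<H", data, r * record_size + off)[0]
                for r in range(n_records)]
        if all(a < b for a, b in zip(vals, vals[1:])):
            yield off

# Fabricated 4-byte records: [timestamp u16][heart rate u8][flags u8]
records = b"".join(struct.pack("<HBB", t, hr, 0)
                   for t, hr in [(100, 72), (105, 75), (110, 71), (115, 74)])
print(list(monotonic_candidates(records, 4)))
```

In practice you would run this for several guessed record sizes (2, 4, 8, 16...) and see which one produces a clean hit.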
I'd start by creating a hex dump of the data. Figure it's probably blocked in some power-of-two-sized chunks. Start looking for repeating patterns. Think about what kind of data they're probably sending. Either they're recording each heart beat individually, or they're recording whatever the sensor is sending at fixed intervals. If it's individual beats, then there's going to be a time delta (since the last beat), a duration, and a max or avg strength of some sort. If it's fixed intervals, then it'll probably be a simple vector of readings. There'll probably be a preamble of some sort, with a start timestamp and the sampling rate. You can try decoding the timestamp yourself, or you might try simply feeding it to ctime() and see if they're using standard absolute time format.
Keep in mind that lots of cheap A/D converters only produce 12-bit outputs, so your readings are unlikely to be larger than 16 bits (and the high-order 4 bits may be used for flags). I'd recommend resetting the device so that it's "blank", dumping and storing the contents, then take a set of readings, record the results (whatever the device normally reports), then dump the contents again and try to correlate the recorded results with whatever data appeared after the "blank" dump.
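The reset-then-correlate step can be scripted too. A sketch (dump contents fabricated) that diffs a "blank" dump against a post-recording dump and reports the byte ranges the device wrote:

```python
# Diff a "blank" dump against a post-recording dump to find the region
# the device actually wrote to.
def changed_regions(blank, recorded):
    """Return (start, end) half-open ranges of bytes that changed."""
    regions, start = [], None
    for i in range(len(blank)):
        if blank[i] != recorded[i]:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(blank)))
    return regions

blank = bytes(32)
recorded = bytearray(blank)
recorded[8:14] = b"\x3c\x01\x48\x4b\x4a\x4c"   # fabricated readings
print(changed_regions(blank, bytes(recorded)))
```

Once you know which region holds the readings, the 12-bit/16-bit observation above narrows down how to slice it.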
Unsure if this is what you're looking for, but Garmin has created an API that runs in your browser. It seems Mac OS X is supported, as well as Windows browsers... I would try it from Google Chrome to see if it can be used instead of this reverse engineering...
http://developer.garmin.com/web-device/garmin-communicator-plugin/
API Features
- Auto-detection of devices connected to a computer
- Access to device product information like product name and software version
- Read tracks, routes and waypoints from supported recreational, fitness and navigation devices
- Write tracks, routes and waypoints to supported recreational, fitness and navigation devices
- Read fitness data from supported fitness devices
- Geo-code an address and save it to a device as a waypoint or favorite
- Read and write Garmin XML files (GPX and TCX) as well as binary files
- Support for most Garmin devices (USB, USB mass-storage, most serial devices)
- Support for Internet Explorer, Firefox and Chrome on Microsoft Windows
- Support for Safari, Firefox and Chrome on Mac OS X
Can you synthesize a heart beat using something like a computer speaker? (I have no idea how such devices actually work). Watch how the binary results change based on different inputs.
Ripping apart the device and checking out what's inside would probably help too.

Internationalization in your projects

How have you implemented Internationalization (i18n) in actual projects you've worked on?
I took an interest in making software cross-cultural after I read the famous post by Joel, The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). However, I have yet to be able to take advantage of this in a real project, besides making sure I used Unicode strings where possible. But making all your strings Unicode and ensuring you understand the encoding of everything you work with is just the tip of the i18n iceberg.
Everything I have worked on to date has been for use by a controlled set of US English speaking people, or i18n just wasn't something we had time to work on before pushing the project live. So I am looking for any tips or war stories people have about making software more localized in real world projects.
It has been a while, so this is not comprehensive.
Character Sets
Unicode is great, but you can't get away with ignoring other character sets. The default character set on Windows XP (English) is Cp1252. On the web, you don't know what a browser will send you (though hopefully your container will handle most of this). And don't be surprised when there are bugs in whatever implementation you are using. Character sets can have interesting interactions with filenames when they move between machines.
Translating Strings
Translators are, generally speaking, not coders. If you send a source file to a translator, they will break it. Strings should be extracted to resource files (e.g. properties files in Java or resource DLLs in Visual C++). Translators should be given files that are difficult to break and tools that don't let them break them.
Translators do not know where strings come from in a product. It is difficult to translate a string without context. If you do not provide guidance, the quality of the translation will suffer.
While on the subject of context, you may see the same string "foo" crop up multiple times and think it would be more efficient to have all instances in the UI point to the same resource. This is a bad idea. Words may be very context-sensitive in some languages.
Translating strings costs money. If you release a new version of a product, it makes sense to reuse the translations from the old versions. Have tools to recover strings from your old resource files.
String concatenation and manual manipulation of strings should be minimized. Use the format functions where applicable.
Translators need to be able to modify hotkeys. Ctrl+P is print in English; the Germans use Ctrl+D.
If you have a translation process that requires someone to manually cut and paste strings at any time, you are asking for trouble.
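To make the points above concrete, here is a minimal sketch of externalised strings: a per-locale key/value catalog (like Java .properties files or resource DLLs, inlined here for the example; keys and translations invented) with an English fallback. Note the hotkey lives in the catalog too, so translators can change it.

```python
# Minimal externalised-strings catalog with English fallback.
CATALOGS = {
    "en": {"greeting": "Hello, {name}!", "print.hotkey": "Ctrl+P"},
    "de": {"greeting": "Hallo, {name}!", "print.hotkey": "Ctrl+D"},
}

def tr(locale, key, **kwargs):
    """Look up a string for `locale`, falling back to English."""
    text = CATALOGS.get(locale, {}).get(key) or CATALOGS["en"][key]
    return text.format(**kwargs)

print(tr("de", "greeting", name="Anna"))
print(tr("de", "print.hotkey"))   # hotkeys are translatable resources too
print(tr("fr", "greeting", name="Anna"))  # missing locale -> English fallback
```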
Dates, Times, Calendars, Currency, Number Formats, Time Zones
These can all vary from country to country. A comma may be used to denote decimal places. Times may be in 24hour notation. Not everyone uses the Gregorian calendar. You need to be unambiguous, too. If you take care to display dates as MM/DD/YYYY for the USA and DD/MM/YYYY for the UK on your website, the dates are ambiguous unless the user knows you've done it.
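One way out of the MM/DD vs DD/MM ambiguity, sketched below: use ISO 8601 for anything machine-read or sortable, and spell the month out when a human has to read it.

```python
from datetime import date

d = date(2008, 2, 7)
print(d.isoformat())           # 2008-02-07: unambiguous and sorts correctly
print(d.strftime("%d %b %Y"))  # e.g. "07 Feb 2008" in an English locale
```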
Especially Currency
The Locale functions provided in the class libraries will give you the local currency symbol, but you can't just stick a pound (sterling) or euro symbol in front of a value that gives a price in dollars.
User Interfaces
Layout should be dynamic. Not only are strings likely to double in length on translation, the entire UI may need to be inverted (Hebrew; Arabic) so that the controls run from right to left. And that is before we get to Asia.
Testing Prior To Translation
Use static analysis of your code to locate problems. At a bare minimum, leverage the tools built into your IDE. (Eclipse users can go to Window > Preferences > Java > Compiler > Errors/Warnings and check for non-externalised strings.)
Smoke test by simulating translation. It isn't difficult to parse a resource file and replace strings with a pseudo-translated version that doubles the length and inserts funky characters. You don't have to speak a language to use a foreign operating system. Modern systems should let you log in as a foreign user with translated strings and foreign locale. If you are familiar with your OS, you can figure out what does what without knowing a single word of the language.
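A pseudo-translation pass is only a few lines. This sketch doubles the string length and swaps in accented characters, so hard-coded or clipped strings stand out immediately in the UI (a real tool would walk your resource files instead of single strings):

```python
# Pseudo-translate a string: accent the vowels, double the length,
# and bracket it so truncation is visible.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöüÀÉÎÖÜ")

def pseudo_translate(s):
    return "[" + s.translate(ACCENTED) + " " + "x" * len(s) + "]"

print(pseudo_translate("Save"))
```

Any string that still shows up un-accented in the pseudo-locale build was never externalised; any clipped bracket reveals a layout that can't absorb translation growth.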
Keyboard maps and character set references are very useful.
Virtualisation would be very useful here.
Non-technical Issues
Sometimes you have to be sensitive to cultural differences (offence or incomprehension may result). A mistake you often see is the use of flags as a visual cue for choosing a website language or geography. Unless you want your software to declare sides in global politics, this is a bad idea. If you were French and offered the option for English with St. George's flag (the flag of England is a red cross on a white field), this might result in confusion for many English speakers - assume similar issues will arise with foreign languages and countries. Icons need to be vetted for cultural relevance. What does a thumbs-up or a green tick mean? Language should be relatively neutral - addressing users in a particular manner may be acceptable in one region, but considered rude in another.
Resources
C++ and Java programmers may find the ICU website useful: http://www.icu-project.org/
Some fun things:
Having a PHP and MySQL application that works well with German and French, but now needs to support Russian and Chinese. I think I'll move this over to .NET, as PHP's Unicode support is - in my opinion - not really good. Sure, juggling around with utf8_decode/utf8_encode or the mbstring functions is fun. Almost as fun as having Freddy Krüger visit you at night...
Realizing that some languages are a LOT more verbose than others. German is usually a lot more verbose than English, and seeing how the German version destroys the user interface because too little space was allocated was not fun. Some products gained a certain fame for their creative workarounds, with Oblivion's "Schw.Tr.d.Le.En.W." being memorable :-)
Playing around with date formats, woohoo! Yes, there ARE actually people in the world who use date formats where the day goes in the middle. Sooooo much fun trying to find out what 07/02/2008 is supposed to mean, just because some users might believe it could be July 2... But then again, you guys over the pond may believe the same about users who put the month in the middle :-P, especially because in English, July 2 sounds a lot better than 2nd of July, something that does not necessarily apply to other languages (e.g. in German, you would never say Juli 2 but always Zweiter Juli). I use 2008-02-07 whenever possible. It's clear that it means February 7 and it sorts properly, but dd/mm vs. mm/dd can be a really tricky problem.
Another fun thing: number formats! 10.000,50 vs 10,000.50 vs 10 000,50 vs 10'000,50... This is my biggest nightmare right now, having to support a multicultural environment but not having any way to reliably know what number format the user will use.
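If you can't trust the system locale to be installed and configured, an explicit separator table is one way out. A sketch below; the separators shown are illustrative, and real data should come from CLDR rather than being hand-maintained.

```python
# Format a number per locale using an explicit (illustrative) separator table.
SEPARATORS = {            # (thousands separator, decimal separator)
    "en_US": (",", "."),
    "de_DE": (".", ","),
    "fr_FR": ("\u00a0", ","),   # French groups with a no-break space
}

def format_number(value, locale):
    group, decimal = SEPARATORS[locale]
    whole, frac = f"{value:,.2f}".split(".")   # "10,000.50" as a template
    return whole.replace(",", group) + decimal + frac

for loc in ("en_US", "de_DE", "fr_FR"):
    print(loc, format_number(10000.50, loc))
```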
Formal or Informal. In some language, there are two ways to address people, a formal way and a more informal way. In English, you just say "You", but in German you have to decide between the formal "Sie" and the informal "Du", same for French Tu/Vous. It's usually a safe bet to choose the formal way, but this is easily overlooked.
Calendars. In Europe, the first day of the Week is Monday, whereas in the US it's Sunday. Calendar Widgets are nice. Showing a Calendar with Sunday on the left and Saturday on the right to a European user is not so nice, it confuses them.
I worked on a project for my previous employer that used .NET, and there was a built-in .resx format we used. We basically had a base .resx file with all the strings, and then multiple files with the different translations. The consequence of this is that you have to be very diligent about ensuring that all strings visible in the application are stored in the .resx, and anytime one is changed you have to update all the languages you support.
If you get lazy and don't notify the people in charge of translations, or you embed strings without going through your localization system, it will be a nightmare to try and fix it later. Similarly, if localization is an afterthought, it will be very difficult to put in place. Bottom line, if you don't have all visible strings stored externally in a standard place, it will be very difficult to find all that need to be localized.
One other note, very strictly avoid concatenating visible strings directly, such as
String message = "The " + item + " is on sale!";
Instead, you must use something like
String message = String.Format("The {0} is on sale!", item);
The reason for this is that different languages often order the words differently, and concatenating strings directly will need a new build to fix, but if you used some kind of string replacement mechanism like above, you can modify your .resx file (or whatever localization files you use) for the specific language that needs to reorder the words.
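The same point as a sketch with a per-language template catalog (the German wording here is just for illustration): the translation file reorders the sentence, while the code never changes.

```python
# Placeholder templates let each translation choose its own word order.
TEMPLATES = {
    "en": "The {item} is on sale!",
    "de": "{item} ist im Angebot!",   # different word order, same code path
}

def sale_message(locale, item):
    return TEMPLATES[locale].format(item=item)

print(sale_message("en", "toaster"))
print(sale_message("de", "Toaster"))
```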
I was just listening to a podcast from Scott Hanselman this morning, where he talks about internationalization, especially the really tricky things, like Turkish (with its four i's) and Thai. Jeff Atwood also had a post on it.
Besides all the previous tips, remember that i18n is not just about swapping words for their equivalents in other languages; right-to-left scripts (Arabic, Hebrew) mean the whole UI has to conform, like
item 1
item 2
item 3
would have to be
arabic text 1 -
arabic text 2 -
arabic text 3 -
(reversed bullet list doesn't seem to work :P)
which can be a UI nightmare if your system has to apply changes dynamically once the user changes the language being used.
Another very hard thing is testing the different languages, not just for the correctness of the words: languages like Korean usually have a bigger font size for their characters, and this may lead to language-specific bugs (like the "SAVE" text on a button being larger than the button itself in some language).
One of the funnier things to discover: italic and bold text markup does not work with CJK (Chinese/Japanese/Korean) characters. They simply become unreadable. (OK, I couldn't really read them before either, but bolding especially just creates ink blots.)
I think everyone working in internationalization should be familiar with the Common Locale Data Repository, which is now a sub-project of Unicode:
Common Locale Data Repository
Those folks are working hard to establish a standard resource for all kinds of i18n issues: currency, geographical names, tons of stuff. Any project that's maintaining its own core locale data, given that this project exists, is pretty bonkers, IMHO.
I suggest using something like 99translations.com to maintain your translations. Otherwise you won't be able to tell which of your translations are up to date in every language.
Another challenge will be accepting input from your users. In many cases, this is eased by the input processing provided by the operating system, such as IME in Windows, which works transparently with common text widgets, but this facility will not be available for every possible need.
One website I use has a translation method the owner calls "wiki + machine translation". This is a community based site so is obviously different to the needs of companies.
http://blog.bookmooch.com/2007/09/23/how-bookmooch-does-its-translations/
One thing no one has mentioned yet is strings with a varying part, as in "The unit will arrive in 5 days" or "On Monday something happens", where 5 and Monday change depending on state. It is not a good idea to split those in two and concatenate them. With only one varying part and good documentation you might get away with it; with two varying parts there will be some language that prefers to change their order.
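This is the problem gettext's plural forms solve. A hand-rolled sketch of the idea (catalog invented for illustration): keep whole sentences per plural form and pick one by count, instead of concatenating fragments.

```python
# Whole-sentence templates per plural form, selected by count.
PLURALS = {
    "en": {1: "The unit will arrive in {n} day.",
           "other": "The unit will arrive in {n} days."},
}

def arrival_message(locale, n):
    forms = PLURALS[locale]
    template = forms.get(n, forms["other"])   # exact form, else the default
    return template.format(n=n)

print(arrival_message("en", 1))
print(arrival_message("en", 5))
```

Real languages need more than two forms (Russian and Arabic have several), which is why gettext lets each catalog define its own plural rule rather than hard-coding "1 vs many".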