Should a warning go to stdout or stderr?

Should a warning message go to stdout or stderr?
Is this defined somewhere (e.g., in POSIX)?
This question is not related to a specific programming language or machine architecture.

It depends. For example, suppose you're implementing a sort program. Your program reads its input from stdin and writes to stdout. Suppose you add an option to do a "numeric" sort instead of a lexicographic sort (the details aren't important). Suppose that you want to write a warning message if there aren't any numbers actually found in the input. Your sort will still work, but it might not be what the user intended. In that case, you would want to write your warning message to stderr, because stdout is reserved for the actual sorted data.
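For concreteness, here is a minimal sketch of such a sort in Tcl (the numeric check is simplified): the sorted data goes to stdout, so the program still composes in a shell pipeline, while the warning goes to stderr.
set lines {}
set hasNumber 0
while {[gets stdin line] >= 0} {
    lappend lines $line
    if {[regexp {\d} $line]} { set hasNumber 1 }
}
if {!$hasNumber} {
    # Diagnostics go to stderr; stdout is reserved for the sorted data
    puts stderr "warning: no numbers found in input"
}
foreach l [lsort $lines] { puts $l }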
On the other hand, if you're writing a program to contact a web service API, read some JSON, extract some info, and write the info to a database, then perhaps writing a warning message to stdout is perfectly acceptable.
So, it depends on what your application is and how it uses stdout. There is no "standard".


Using write access in Open command in TCL

How can I use write ('w') and read ('r') access with a command pipeline in the open command in Tcl?
When I do something like:
set f1 [open "| ls -l" w]
it returns a file descriptor to write to, say file1.
Now I am confused about how I can put this file descriptor to use.
PS: My example might be wrong, and in that case it'd be ideal if the answer includes a programming example so that it'll be clearer.
Thanks
In general, the key things you can do with a channel are write to it (using puts), read from it (using gets and read), and close it. Obviously, you can only write to it if it is writable, and only read from it if it is readable.
When you write to a channel that is implemented as a pipeline, you send data to the program on the other end of the pipe; that's usually consuming it as its standard input. Not all programs do that; ls is one of the ones that completely ignores its standard input.
But the other thing you can do, as I said above, is close the channel. When you close a pipeline, Tcl waits for all the subprocesses to terminate (if they haven't already) and collects their standard error output, which becomes an error message from close if there is anything. (The errors are just like those you can get from calling exec; the underlying machinery is shared.)
There's no real point in running ls in a pure writable pipeline, at least not unless you redirect its output. Its whole purpose is to produce output (the sorted list of files, together with extra details with the -l option). If you want to get the output, you'll need a readable channel (readable from the perspective of Tcl): open "| ls -l" r. Then you'll be able to use gets $f1 to read a line from the subprocess.
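For example (a minimal sketch; note that anything ls wrote to stderr surfaces when the channel is closed):
set f1 [open "| ls -l" r]
while {[gets $f1 line] >= 0} {
    puts "got: $line"
}
# Closing the pipeline reaps the subprocess; its stderr output,
# if any, becomes an error from close
if {[catch {close $f1} err]} {
    puts stderr "ls reported: $err"
}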
But since ls is entirely non-interactive and almost always runs very quickly (unless your directories are huge or you pass the options to enable recursion), you might as well just use exec. That does not necessarily apply to other programs; you need to understand what's going on.
If you want to experiment with pipelines, try using sort -u as the subprocess. That takes input and produces output, and exhibits all sorts of annoying behaviour along the way! Understanding how to work with it will teach you a lot about how program automation can be tricky despite it really being very simple.
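A sketch of that experiment (the half-close needs Tcl 8.6 or later; sort produces no output until its input side sees end-of-file):
set f [open "| sort -u" r+]
puts $f "banana"
puts $f "apple"
puts $f "banana"
flush $f
# Half-close the write side so sort sees EOF and emits its output
close $f write
while {[gets $f line] >= 0} {
    puts "sorted: $line"
}
close $f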

How to use printf to debug TCL source v8.4

I am using Tcl as an embedded control language in my system and I need to modify its core source a bit; I mean the code under generic/, such as tclInterp.c. I am adding printf calls to the source code to trace my modifications, but for some reason I cannot see the output. I see the code uses fprintf; I used that and tried both stdout and stderr, but it's still not working.
I already added "--enable-symbols=all" when running configure and rebuilt the packages. Is there anything else I need to do?
You should use a debugger instead. Adding printf statements to the core code will result in your output appearing on stdout, which your Tcl scripts may redirect; using fprintf(stderr, ...) might be less likely to clash with the scripts you run. --enable-symbols just produces a debuggable build; it does not affect the ability to write to stdout, but it does let a debugger produce meaningful output.
You don't say, but if you are on Windows and embedded in a graphical program, then you probably don't have a stdout anyway. On Windows, you are best off using OutputDebugString and watching the messages in Visual Studio's output window or Sysinternals DbgView.
On Unix, the console you launch the application from should show the output. However, actually tracing your mods with a debugger will be the best route.
Are you sure you really need to modify the core? Seems unlikely to me. Normally you just add additional commands to the interpreter to provide the interface to your hosting application. The Tcl API offers access to pretty much everything you might reasonably want to fiddle with.

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable for the file, with each method opening/appending to/closing it? (A problem is that methods sometimes call other methods, so this seems kind of messy?)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, with a "central" method that calls all the other tests and outputs all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different from checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code that focuses on the scraping activity and returns the value(s) scraped. Then use aspect-oriented programming for logging, which can simplify the problem greatly, as the aspect holds the reference to the file or logging API.
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my head, sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
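As an illustration only, here is a minimal sketch of that first approach in Tcl (assuming Tcl 8.6's TclOO; the method names come from the question, but their bodies and the report file name are invented): the checker object owns the report channel, so checks can call other checks without any file-handle juggling.
package require TclOO
oo::class create ScrapeChecker {
    variable out
    constructor {reportFile} {
        # The object owns the report file for its whole lifetime
        set out [open $reportFile w]
    }
    destructor {
        close $out
    }
    method check_num_empty_results {data} {
        puts $out "empty results: [llength [lsearch -all -exact $data {}]]"
    }
    method check_weird_characters {data} {
        set n 0
        foreach item $data {
            if {[regexp {[^[:print:]]} $item]} { incr n }
        }
        puts $out "items with non-printable characters: $n"
    }
    method check_scrape {data} {
        # Composite checks reuse the same channel implicitly
        my check_num_empty_results $data
        my check_weird_characters $data
    }
}
set checker [ScrapeChecker new scrape_report.txt]
$checker check_scrape {foo {} bar}
$checker destroy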
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de facto standards for that.

How to analyze binary file?

I have a binary file. I don't know how it's formatted; I only know it comes from Delphi code.
Is there any way to analyze a binary file?
Is there any "pattern" for analyzing and deserializing the binary content of a file with an unknown format?
Try these:
Deserialize the data: analyze how your exe was compiled (try File Analyzer). Try to deserialize the binary data with the language discovered, then serialize it in an XML format (language-independent) that every programming language can understand.
Analyze the binary data: try to save various versions of the file with little variations, and use a diff program together with a hex editor to work out the meaning of every bit. Use this in conjunction with binary hacking techniques (like How to Crack a Binary File Format by Frans Faase).
Reverse engineer the application: try getting the code using reverse engineering tools for the programming language used to build the app (found with File Analyzer). Otherwise use a disassembler analysis tool like IDA Pro Disassembler.
For my hobby project I had to reverse engineer some old game files. My approaches were:
Have a good hex editor.
Look for readable words in the binary file and note how they are distributed. If the distance between them is constant, you know it is a listing.
Look for 2-3 consecutive zeros; they might indicate an int32 value.
Some dwords might be pointers into the file (see the sketch after this list).
Try to identify recurring patterns in the file.
Seeing lots of C0-CF might indicate RLE-compressed data.
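To illustrate the dword/pointer tip, here is a small exploratory sketch in Tcl (mystery.bin is a placeholder name): it dumps the first few dwords of the file as little-endian 32-bit integers and flags values that would be plausible offsets into the file.
set f [open mystery.bin r]
fconfigure $f -translation binary
set data [read $f 64]
close $f
set fileSize [file size mystery.bin]
for {set off 0} {$off + 4 <= [string length $data]} {incr off 4} {
    binary scan $data "@${off}i" value   ;# i = 32-bit little-endian integer
    set note [expr {$value > 0 && $value < $fileSize ? " (possible file offset)" : ""}]
    puts [format "%04x: %d%s" $off $value $note]
}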
I've developed Hexinator (Windows & Linux) and Synalyze It! (macOS) exactly for this purpose. These applications let you view binary files like other hex editors do, but additionally you can create a "grammar" with the specifics of a binary file format. The grammar contains all the building blocks and is used to parse the file automatically.
Thus you can keep the knowledge you gain in the analysis and apply it to multiple files simultaneously. You can also color-code the bits and pieces of file formats for a quick overview in the hex editor.
The parsing results are displayed in a tree view where you can also modify the files easily (applying endianness et cetera).
Reverse engineering a binary file when you have some idea of what it represents is a very time-consuming process. If you have no idea what it is, then it will be even harder.
It is possible, but you need a pretty good reason for doing so.
The first step would be to open it up in a hex editor of your choice and see if you can find any English text to point you in the direction of what the file is even supposed to represent. From there, Google "reverse engineering binary files"; there are much more knowledgeable people than me who have written guides about it.
The "strings" program from GNU binutils is very useful. It will print the strings of printable characters in a file, quite often giving a clue to what a file contains or a program does.
If the data represents serialized Delphi objects, you should start reading about the Delphi serialization process. If that's the case, I think your best bet would be to load it using Delphi and continue your analysis from the IDE. Some information about Delphi serialization can be found here.
EDIT: if the file does contain serialized Delphi objects, then you should write a small Delphi program that loads it and "converts" the data yourself to something neutral, like XML. If you manage to do this, you should check and see if Delphi supports serializing to XML. Then you could access those objects from any language.
The unix "file" command is really useful - I don't know if there is anything like it in windows. You run it like this:
file myfile.ext
And it spits out a text description based on the magic numbers and data contained therein.
It is probably included in Cygwin.
If you have access to the application that creates the file, you can apply changes to the application, then save the file and see the effects (Keep in mind that numbers are probably stored in little endian):
First, create the file repeatedly. If the files are not binary-identical, the current date/time is probably stored in the area where the differences occur.
Maybe you want to repeat that with the software running under different environments, to see if the OS version etc. are stored, but this is rather unusual.
Next you can try to change single variables and create several files that only differ in the value of this variable. This helps you identify where this variable is stored.
That way you can also exclude variables that are not stored in the file: If you change them, but the files created are identical, they are not stored.
In order to test the hypotheses you worked out with the steps above, edit one of the files and have the application read it.
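A small Tcl sketch supporting the save-twice-and-compare step (before.bin and after.bin are placeholder names): it reports every offset at which the two saved files differ.
proc readBytes {path} {
    set f [open $path r]
    fconfigure $f -translation binary
    set data [read $f]
    close $f
    return $data
}
set a [readBytes before.bin]
set b [readBytes after.bin]
set len [expr {[string length $a] < [string length $b]
               ? [string length $a] : [string length $b]}]
for {set i 0} {$i < $len} {incr i} {
    if {[string index $a $i] ne [string index $b $i]} {
        puts [format "files differ at offset 0x%04x" $i]
    }
}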
If you don't have access to the application itself, I suggest that you forget about it and find another way to solve your problem. There is a very high probability that it will be faster...
If file does not give a meaningful answer, you may want to try TrID by Marco Pontello to determine whether your data is stored in a known format.
Get the Delphi application, open it in the freeware version of IDA Pro, find where it writes the file, and decode the format from how it writes it.
Unless it's plain text.
Do you know the program that uses it? If so, you can hook that program's write-to-file function and get an idea of what data it's writing, the size of the data, and where.
More Info: http://www.codeproject.com/KB/DLL/Win32APIHooking_Trouble.aspx
Unlike traditional hex editors which only display the raw hex bytes of a file, 010 Editor can also parse a file into a hierarchical structure using a Binary Template. The results of running a Binary Template are much easier to understand and edit than using just the raw hex bytes.
http://www.sweetscape.com/010editor/
Try to open it in a hex editor and analyse.

Free text search integrated with code coverage

Is there any tool which will allow me to perform a free text search over a system's code, but only over the code which was actually executed during a particular invocation?
To give a bit of background, when learning my way around a new system, I frequently find myself wanting to discover where some particular value came from, but searching the entire code base turns up far more matches than I can reasonably assess individually.
For what it's worth, I've wanted this in Perl and Java at one time or another, but I'd love to know if any languages have a system supporting this feature.
You can generally twist a code coverage tool's arm and get a report that shows the paths that have been executed during a given run. This report should show the code itself, with the first few columns marked up according to the coverage tool's particular notation on whether a given path was executed.
You might be able to use this straight up, or you might have to preprocess it and either remove the code that was not executed, or add a new notation on each line that tells whether it was executed (most tools will only show path information at control points):
So from a coverage tool you might get a report like this:
T- if(sometest)
{
x somecode;
}
else
{
- someother_code;
}
The notation T- indicates that the if statement only ever evaluated to true, and so only the first branch of the code executed. The latter notation, 'x', indicates that the line was executed.
You should be able to form a regex that matches only when the first column contains a T, F, or x so you can capture all the control statements executed and lines executed.
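For instance, a small Tcl sketch (the report file name and exact notation follow the example above, not any particular tool): keep only executed lines that also contain the search term.
set term "somecode"   ;# the free-text search target
set f [open coverage_report.txt r]
while {[gets $f line] >= 0} {
    # First column contains T, F, or x => the line or branch was executed
    if {[regexp {^\s*[TFx]} $line] && [string match "*$term*" $line]} {
        puts $line
    }
}
close $f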
Sometimes you'll only get coverage information at each control point, which then requires you to parse the C file and mark the executed lines yourself. Not as easy, but not impossible either.
Still, this sounds like an interesting question where the solution is probably more work than it's worth...
-Adam