Difference between bug and failure [closed] - terminology

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I want to ask about the difference between bug and failure and error, i read that the error is mistake made by people, but i conflicted between the difference between the bug and failure. I can't know the difference exactly. Can any one help please and give simple snippet for code represents the difference.
Thanks a lot.

Bug is a programming error - not checking array bounds, ignoring error codes, multiple deletions, memory leaks, etc. fall under this general category. Errors like this require code changes to fix (there may be work-arounds that do not require code changes, though)
Failure is a system error - disconnection of storage, lack of network connectivity, and hardware failures are in this category. Fixing failures usually requires configuring other parts of the system, not the program itself.
User errors are mistakes made by users - entering values incorrectly or providing incomplete data are in this category. Errors like that are fixed by the user who uses the program without anyone else's involvement.

By my definition I would say
An error is about my behavior, or my acting. so I make errors.
A bug is the the result of my error in the program code.
The failure is the malfunction of my buggy software.
but others may interpret this differently.

A fault or Bug is a defect within the system (Somewhere hidden in the code and maybe never detected!).
An error is a deviation of the required operation of the system or subsystem. (The fault detected during execution but no harm).
A failure occurs when the system fails to perform its required function. (System crash)
An Error is a manifestation of a fault in a system, which could lead to system failure.
(Singhal/Shivaratri)
Example:
If you multiply x with 4 instead of 2 in your code, but there is no way to affect any functionalists or is not visible. This is a bug or fault.
If user can see it, let's say with having a wrong text as subject of email, then this is an error but still system worked and no harmful event happened.
But if your system withdraw the wrong money to user in a bank or your robot cut the head of the lady instead of cutting the cake for her then this is a failure :)

Instead of code snippets, I gave your examples below. I hope examples help you to understand the term better.
Bug is a term used among testers to address faults in software.
Error is a value or state or operation that is varies from expected value or state or operation. For example, programmer makes a mistake like missing a semi-colon, calling a wrong function name.
Result from system != Expected Result from system
Fault is a error brought into a system during design or implementation stage that is capable of causing system failure. Imagine some company X gives discount to their loyal customer. Loyal customer is someone who shops 10 times in a month. In software, programmer enter 20 times instead of 10. This is a mistake introduced by programmer called error. then it turned to fault. In tester language, it is a bug.
System failure is an inability of a system to do, what is required of the system. For example, if a user tries to sign up for account in a social networking site if they site fails to register the user. Then, that is system failure.
Technically,
Error -----> Fault -----> Failure
The root cause of any failure is error.

Related

When running the DAML sandbox an error occurs

The following error occurs when running the sandbox:
io.grpc.netty.NettyServerHandler onStreamError
WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$HeaderListSizeException: Header size exceeded max allowed size (8192)
What could the cause of this be?
I have seen this error numerous times, and it is a consequence of having a transaction failure in a complex DAML model/transaction when running on the Sandbox. When you experience a transaction failure (fetch/exercise an inactive contract, lookupByKey returned a stale cid, head [], divide-by-zero, etc) the sandbox helpfully tries to provide transaction trace information in the error result.
This is normally fine for relatively simple models. With more complex models this trace can exceed the maximum header size producing the error you see. When this happens I have found the trace in the sandbox.log file, sometimes along with other errors that help explain what is going on.
The trace is an unformatted dump, so it can take a bit of effort to decode manually, but I have done it many times myself and the information I needed to identify the issue has always been there —— and to be honest, generally just knowing the choice I was exercising + the specific class of error is normally enough to point me in the right direction.
I believe there is some tooling being built to help with this sort of diagnosis; however, I don't know how advanced the work on that is.

Software Security Protection with Hardware Dongle [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 1 year ago.
Improve this question
I have read all the existing discussions on piracy and hardware support, so this is not the same old question. I have a new twist on this old discussion. You can now purchase dongles for USB that allow you to put some of your important code into the dongle. If you have a complex algorithm and you put it into the dongle, someone would have to reverse engineer the contents of the dongle. If they tried to spoof the dongle, as was possible in the past, this would not work. All they can see is that data goes into a "black box" and result data comes out. It is no longer a matter of finding a jump true/false to bypass a license check in the source code.
Perhaps a mathematician with a lot of idle time on his hands could eventually reverse it, but that is an extreme level of interest! The other option is that the hardware dongle itself would need to be hacked. There are many protections against this built in, but this is probably the most effective approach.
So I want to take a scenario and see if I've missed something. I put the important part of my algorithm into the dongle to protect it. 6 doubles and 1 int go into the dongle, 1 double and 1 int are returned. This happens for thousands of data points. This is one of several functions of similar complexity. A hacker can see the rest of my assembly code (which I do as much as possible to obfuscate), but lets assume it is easily hacked. My question is, how hard is it to break into the dongle to access my assembly code in this proprietary hardware? Let's take as an example this companies product: http://www.senselock.com
I am not interested in lectures on how I'm inconveniencing customers and should open source my product, please. I am looking for a technical discussion on how a software/hardware engineer might approach extracting my assembly object from such a device. And I am not asking in order to hack one, but to know how much hassle I have as my discouragement against tampering. I know if there is a will, there is always a way. But at first glance it looks like it would take several thousand dollars worth of effort to bypass this scheme?
Given the response so far, I am adding some more specifics. The dongle has the following property, "Access to the chip is protected by PIN, and the maximum re-tries is pre-set by software developers. For instance, under a dictionary attack, once the number of re-tries exceed the pre-set value, the chip will trigger a self-locking mechanism". So to access the chip and thus the code inside it, you have to know the PIN, otherwise after let's say 10 tries you will be locked out. I personally can't see any way anyone could compromise this system. It doesn't matter what goes in or out, what matters is what runs inside the dongle ARM processor. Physical forced access would destroy the chip. Electrical access would require the PIN, or the chip locks up. How else could it be compromised?
I pretty much agree with your point of view that all dongles could be hacked, it just the matter of time and cost. If your encryption scheme is well-designed the EAL 5+ chip should be secure enough to prevent your software form malicious attacks.
And I think if you can READ the dongle it's probably means you already hacked the dongle, or it proofs there is a fatal vulnerability in the encryption scheme.
BTW, the link you give above is not work. Are you referring to this dongle? http://www.senselock.com/en/productinfor.php?nid=180&id=142&pid=
there are companies(such as break-ic.com) which have the list of mcu which they can break.
after breaking they give you only hex files.
in this case(mcu)every manufacturer has its own disassembler because of hardware architecture of every mcu core and there is no guarantee that your desire disassembler is exist!!!
so you must search for dongles which they have unbreakable mcu or their mcu has no disassembler.
or you can build you own dongle!!

To what extent should code try to explain fatal exceptions?

I suspect that all non-trivial software is likely to experience situations where it hits an external problem it cannot work around and thus needs to fail. This might be due to bad configuration, an external server being down, disk full, etc.
In these situations, especially if the software is running in non-interactive mode, I expect that all one can really do is log an error and wait for the admin to read the logs and fix the problem. If someone happens to interact with the software in the meantime, e.g. a request comes in to a server that failed to initialize properly, then perhaps an appropriate hint can be given to check the logs and maybe even the error can be echoed (depending on whether you can tell if they're a technical guy as opposed to a business user). For the moment though let's not think too hard about this part.
My question is, to what extent should the software be responsible for trying to explain the meaning of the fatal error? In general, how much competence/knowledge are you allowed to presume on administrators of the software, and how much should you include troubleshooting information and potential resolution steps when logging fatal errors? Of course if there's something that's unique to the runtime context this should definitely be logged; but lets assume your software needs to talk to Active Directory via LDAP and gets back an error "[LDAP: error code 49 - 80090308: LdapErr: DSID-0C090334, comment: AcceptSecurityContext error, data 525, vece]". Is it reasonable to assume that the maintainers will be able to Google the error code and work out what it means, or should the software try to parse the error code and log that this is caused by an incorrect user DN in the LDAP config?
I don't know if there is a definitive best-practices answer for this, so I'm keen to hear a variety of views.
The approach I tend to agree with is that you should explain as much as possible if the fatal error is caused by some code in your own responsibility (i.e. not third party). Otherwise if the error is caused "further down", for example at the database level, then the administrators should be passed up the error returned without adding much further information. So if the database server dies, then your connector with throw some exception, and you would log the error code in the exception.
The administrator or support staff should then have sufficient knowledge to resolve the issue with the provided information.
When you do provide too much details on errors which are not caused by your own code you run the risk of having error details NOT matching the cause of the actual error, especially if the error codes stop matching between versions.
Of course, there are exceptions. We have worked with open source libraries that were so poorly documented that we ended up writing wrappers around the libraries just to provide decent logging of what actually is going on.
Just my 2c
The answer, as for all broad questions, is "it depends."
If you're looking at a configuration error, then by all means you should try to explain what was wrong (in the logs). If it's an out-of-memory error, there's not much you can do -- and you may not even be able to write a log message.
One thing you said concerns me:
If someone happens to interact with
the software in the meantime, e.g. a
request comes in to a server that
failed to initialize properly, then
perhaps an appropriate hint can be
given to check the logs
If this is truly a fatal error, the server should not be running, and therefore any incoming request should fail with absolutely no warning or explanation.
You should at least provide the message from the exception and a stack trace so you can find out where in the code it occurred. If possible, you should also explain what you were attempting to do and what you think may have happened depending on the exception type.
I guess it depends on how much time you have before delivering the software to your customers.
Yes, it would be nice to parse the error and give a more explicit message but, in this day and age, Google is not always very far.
So unless, you have time to create the code to parse errors, I would leave them as is.
IMHO you can never provide too much information in these case.
In the real world it comes down to cost-benefit analysis, though. What's the impact of the error to you, your app, your business, etc. How much time is it worth spending on it.
In a business critical app my first point applies. Everything else is a sliding scale.
I think it depends on who is using the application.
If the application is used by tech savvy people then show more technical details, so they will be able to troubleshoot the problem if they want. I've had some users go to great lengths to solve issues. It can be very helpful, especially for issues that are specific to certain configurations.
If your user base is more of the average Joe then technical details will confuse them in most cases. You should show them a simple error message, and try to offer some solutions if possible.
You could also merge the two techniques. Show a simple error message by default and allow the user to view more detailed error information if they want.
You just don't want to overwhelm the user with too much information that they don't understand. It just frustrates and confuses them in the majority of cases.
There are two aspects I think all errors and exceptions should have:
1) Enough information in the error to help debug the problem. Stacktrace, class/method name, type of exception etc fall in this category.
2) A human understandable message, ideally clear enough for say Ops team or Sysadmins engineer to know who to call or forward that error message. Typically it is of the form "so and so module failed" or "network call failed" etc. Something that will come as close to you explaining the problem to customer, in non technical jargon.
Now with all the time constraints etc it may not be possible to have both messages programmed in. Then I would go out on a limb and say we should have the second type of error message. Remember, the sysadmin would probably be able to call you and since you helped write the code you can maybe pinpoint the error. But if the customer is on phone asking about the error, the sysadmin better be able to explain the possible cause :)
On a different note, all products need a clear exception/error handling mechanism decided at architecture level. And the exceptions NEED to adhere to that design. There are few things more frustrating than trying to debug an error based on a design only to find out a day later that its a one of a kind error message based on completely different design.
See https://meta.stackexchange.com/questions/3122/formatting-sandbox

How can I support the support department better?

With the best will in the world, whatever software you (and me) write will have some kind of defect in it.
What can I do, as a developer, to make things easier for the support department (first line, through to third line, and development) to diagnose, workaround and fix problems that the user encounters.
Notes
I'm expecting answers which are predominantly technical in nature, but I expect other answers to exist.
"Don't release bugs in your software" is a good answer, but I know that already.
Log as much detail about the environment in which you're executing as possible (probably on startup).
Give exceptions meaningful names and messages. They may only appear in a stack trace, but that's still incredibly helpful.
Allocate some time to writing tools for the support team. They will almost certainly have needs beyond either your users or the developers.
Sit with the support team for half a day to see what kind of thing they're having to do. Watch any repetitive tasks - they may not even consciously notice the repetition any more.
Meet up with the support team regularly - make sure they never resent you.
If you have at least a part of your application running on your server, make sure you monitor logs for errors.
When we first implemented daily script which greps for ERROR/Exception/FATAL and sends results per email, I was surprised how many issues (mostly tiny) we haven't noticed before.
This will help in a way, that you notice some problems yourself before they are reported to support team.
Technical features:
In the error dialogue for a desktop app, include a clickable button that opens up and email, and attaches the stacktrace, and log, including system properties.
On an error screen in a webapp, report a timestamp including nano-seconds and error code, pid, etc so server logs can be searched.
Allow log levels to be dynamically changed at runtime. Having to restart your server to do this is a pain.
Log as much detail about the environment in which you're executing as possible (probably on startup).
Non-technical:
Provide a known issues section in your documentation. If this is a web page, then this correspond to a triaged bug list from your bug tracker.
Depending on your audience, expose some kind of interface to your issue tracking.
Again, depending on audience, provide some forum for the users to help each other.
Usability solves problems before they are a problem. Sensible, non-scary error messages often allow a user to find the solution to their own problem.
Process:
watch your logs. For a server side product, regular reviews of logs will be a good early warning sign for impending trouble. Make sure support knows when you think there is trouble ahead.
allow time to write tools for the support department. These may start off as debugging tools for devs, become a window onto the internal state of the app for support, and even become power tools for future releases.
allow some time for devs to spend with the support team; listening to customers on a support call, go out on site, etc. Make sure that the devs are not allowed to promise anything. Debrief the dev after doing this - there maybe feature ideas there.
where appropriate provide user training. An impedence mismatch can cause the user to perceive problems with the software, rather than the user's mental model of the software.
Make sure your application can be deployed with automatic updates. One of the headaches of a support group is upgrading customers to the latest and greatest so that they can take advantage of bug fixes, new features, etc. If the upgrade process is seamless, stress can be relieved from the support group.
Similar to a combination of jamesh's answers, we do this for web apps
Supply a "report a bug" link so that users can report bugs even when they don't generate error screens.
That link opens up a small dialog which in turn submits via Ajax to a processor on the server.
The processor associates the submission to the script being reported on and its PID, so that we can find the right log files (we organize ours by script/pid), and then sends e-mail to our bug tracking system.
Provide a know issues document
Give training on the application so they know how it should work
Provide simple concise log lines that they will understand or create error codes with a corresponding document that describes the error
Some thoughts:
Do your best to validate user input immediately.
Check for errors or exceptions as early and as often as possible. It's easier to trace and fix a problem just after it occurs, before it generates "ricochet" effects.
Whenever possible, describe how to correct the problem in your error message. The user isn't interested in what went wrong, only how to continue working:
BAD: Floating-point exception in vogon.c, line 42
BETTER: Please enter a dollar amount greater than 0.
If you can't suggest a correction for the problem, tell the user what to do (or not to do) before calling tech support, such as: "Click Help->About to find the version/license number," or "Please leave this error message on the screen."
Talk to your support staff. Ask about common problems and pet peeves. Have them answer this question!
If you have a web site with a support section, provide a hyperlink or URL in the error message.
Indicate whether the error is due to a temporary or permanent condition, so the user will know whether to try again.
Put your cell phone number in every error message, and identify yourself as the developer.
Ok, the last item probably isn't practical, but wouldn't it encourage better coding practices?
Provide a mechanism for capturing what the user was doing when the problem happened, a logging or tracing capability that can help provide you and your colleagues with data (what exception was thrown, stack traces, program state, what the user had been doing, etc.) so that you can recreate the issue.
If you don't already incorporate developer automated testing in your product development, consider doing so.
Have a mindset for improving things. Whenever you fix something, ask:
How can I avoid a similar problem in the future?
Then try to find a way of solving that problem.

How to determine what log level to use? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
The log levels WARN, ERROR and FATAL are pretty clear. But when is something DEBUG, and when INFO?
I've seen some projects that are annoyingly verbose on the INFO level, but I've also seen code that favors the DEBUG level too much. In both cases, useful information is hidden in the noise.
What are the criteria for determining log levels?
I don't think there are any hard-and-fast rules; using the log4j-type levels, my 'rules of thumb' are something like:
FATAL: the app (or at the very least a thread) is about to die horribly. This is where the info explaining why that's happening goes.
ERROR: something that the app's doing that it shouldn't. This isn't a user error ('invalid search query'); it's an assertion failure, network problem, etc etc., probably one that is going to abort the current operation
WARN: something that's concerning but not causing the operation to abort; # of connections in the DB pool getting low, an unusual-but-expected timeout in an operation, etc. I often think of 'WARN' as something that's useful in aggregate; e.g. grep, group, and count them to get a picture of what's affecting the system health
INFO: Normal logging that's part of the normal operation of the app; diagnostic stuff so you can go back and say 'how often did this broad-level operation happen?', or 'how did the user's data get into this state?'
DEBUG: Off by default, able to be turned on for debugging specific unexpected problems. This is where you might log detailed information about key method parameters or other information that is useful for finding likely problems in specific 'problematic' areas of the code.
TRACE: "Seriously, WTF is going on here?!?! I need to log every single statement I execute to find this ##$#ing memory corruption bug before I go insane"
Not set in stone, but a rough idea of how I think of it.
Informally I use this sort of hierarchy,
DEBUG - actual trace values
INFO - Something just happened - nothing important, just a flag
WARN - everything's working, but something isn't quite what was expected
ERROR - something has happened that will need to be fixed, but we can carry on and do other (independent) activities
FATAL - a serious enough problem that we shouldn't even carry on
I'll generally release with INFO being logged, but only if I know that log files are actually reviewed (and size isn't an issue), otherwise it's WARN.
Think about who needs to use each level.
In my code I keep DEBUG reserved for a developer output, e.g. output that would only help a developer.
VERBOSE is used for a normal user when alot of info is needed.
INFO I use to normally show major events (e.g. sending a webpage, checking something important).
And FAIL and WARN are pretty self explanatory.
The convention in my team is to use debug if something is calculated in the message, whereas info is used for plain text. So in effect info will show you what's happening and debug will show the values of the things that are happening.
I tend to target INFO towards the user to give them messages that aren't even warnings. DEBUG tends to be for developer use where I output messages to help trace the flow through the code (with values of variables as well).
I also like another level of DEBUG (DEBUG2?) which gives absolute bucketloads of debug information such as hex dumps of all buffers and so on.
There's no need for a DEBUG2 level. That's what 'TRACE' is for. TRACE is intended to be the absolute lowest level of logging outputting every possible piece of information you might want to see.
To avoid a deluge of information, it is generally not recommended that you enable trace-level logging across an entire project. Instead use 'DEBUG' to find out general information about the bug and where it occurs (hence the name), and then enable TRACE only for that component if you still can't figure it out.