Is it bad practice to use the system() function when library functions could be used instead? Why? - language-agnostic

Say there is some functionality needed for an application under development which could be achieved by making a system call to either a command line program or utilizing a library. Assuming efficiency is not an issue, is it bad practice to simply make a system call to a program instead of utilizing a library? What are the disadvantages of doing this?
To make things more concrete, an example of this scenario would be an application which needs to download a file from a web server, either the cURL program or the libcURL library could be used for this.

Unless you are writing code for only one OS, there is no way of knowing if your system call will even work. What happens when there is a system update or an OS upgrade?
Never use a system call if there is a library to do the same function.

I prefer libraries because of the dependency issue, namely the executable might not be there when you call it, but the library will be (assuming external library references get taken care of when the process starts on your platform). In other words, using libraries would seem to guarantee a more stable, predictable outcome in more environments than system calls would.

There are several factors to take into account. One key one is the reliability of whether the external program will be present on all systems where your software is installed. If there is a possibility that it will be missing, then maybe it is better to do it inside your program.
Weighing against that, you might consider that the extra code loaded into your program is prohibitive - you don't need the code bloat for such a seldom-used part of your application.
The system() function is convenient, but dangerous, not least because it invokes a shell, usually. You may be better off calling the program more directly - on Unix, via the fork() and exec() system calls. [Note that a system call is very different from calling the system() function, incidentally!] OTOH, you may need to worry about ensuring all open file descriptors in your program are closed - especially if your program is some sort of daemon running on behalf of other users; that is less of a problem if your are not using special privileges, but it is still a good idea not to give the invoked program access to anything you did not intend. You may need to look at the fcntl() system call and the FD_CLOEXEC flag.
Generally, it is easier to keep control of things if you build the functionality into your program, but it is not a trivial decision.

Security is one concern. A malicious cURL could cause havoc in your program. It depends if this is a personal program where coding speed is your main focus, or a commercial application where things like security play a factor.

System calls are much harder to make safely.
All sorts of funny characters need to be correctly encoded to pass arguments in, and the types of encoding may vary by platform or even version of the command. So making a system call that contains any user data at all requires a lot of sanity-checking and it's easy to make a mistake.

Yeah, as mentioned above, keep in mind the difference between system calls (like fcntl() and open()) and system() calls. :)
In the early stages of prototyping a c program, I often make external calls to programs like grep and sed for manipulation of files using popen(). It's not safe, it's not secure, and it's certainly not portable. But it can allow you to get going quickly. That's valuable to me. It lets me focus on the really important core of the program, usually the reason I used c in the first place.
In high level languages, you'd better have a pretty good reason. :)

Instead of doing either, I'd Unix it up and build a script framework around your app, using the command line arguments and stdin.

Other's have mentioned good points (reliability, security, safety, portability, etc) - but I'll throw out another. Performance. Generally it is many times faster to call a library function or even spawn a new thread then it is to start an entire new process (and then you still have to correctly check/verify it's execution and parse it's output!)

Related

The use of packages to parse command arguments employing options/switches?

I have a couple questions about adding options/switches (with and without parameters) to procedures/commands. I see that tcllib has cmdline and Ashok Nadkarni's book on Tcl recommends the parse_args package and states that using Tcl to handle the arguments is much slower than this package using C. The Nov. 2016 paper on parse_args states that Tcl script methods are or can be 50 times slower.
Are Tcl methods really signicantly slower? Is there some minimum threshold number of options to be reached before using a package?
Is there any reason to use parse_args (not in tcllib) over cmdline (in tcllib)?
Can both be easily included in a starkit?
Is this included in 8.7a now? (I'd like to use 8.7a but I'm using Manjaro Linux and am afraid that adding it outside the package manager will cause issues that I won't know how to resolve or even just "undo").
Thank you for considering my questions.
Are Tcl methods really signicantly slower? Is there some minimum threshold number of options to be reached before using a package?
Potentially. Procedures have overhead to do with managing the stack frame and so on, and code implemented in C can avoid a number of overheads due to the way values are managed in current Tcl implementations. The difference is much more profound for numeric code than for string-based code, as the cost of boxing and unboxing numeric values is quite significant (strings are always boxed in all languages).
As for which is the one to use, it really depends on the details as you are trading off flexibility for speed. I've never known it be a problem for command line parsing.
(If you ask me, fifty options isn't really that many, except that it's quite a lot to pass on an actual command line. It might be easier to design a configuration file format — perhaps a simple Tcl script! — and then to just pass the name of that in as the actual argument.)
Is there any reason to use parse_args (not in tcllib) over cmdline (in tcllib)?
Performance? Details of how you describe things to the parser?
Can both be easily included in a starkit?
As long as any C code is built with Tcl stubs enabled (typically not much more than define USE_TCL_STUBS and link against the stub library) then it can go in a starkit as a loadable library. Using the stubbed build means that the compiled code doesn't assume exactly which version of the Tcl library is present or what its path is; those are assumptions that are usually wrong with a starkit.
Tcl-implemented packages can always go in a starkit. Hybrid packages need a little care for their C parts, but are otherwise pretty easy.
Many packages either always build in stubbed mode or have a build configuration option to do so.
Is this included in 8.7a now? (I'd like to use 8.7a but I'm using Manjaro Linux and am afraid that adding it outside the package manager will cause issues that I won't know how to resolve or even just "undo").
We think we're about a month from the feature freeze for 8.7, and builds seem stable in automated testing so the beta phase will probably be fairly short. The list of what's in can be found here (filter for 8.7 and Final). However, bear in mind that we tend to feel that if code can be done in an extension then there's usually no desperate need for it to be in Tcl itself.

Sending functions rather than data

Nowadays, we always think like "send your data to a server, it computes it for you, then send you back the response".
But imagine something else : i want my client to compute the data itself.
The question is : is there something like a universal protocol to send actions rather than data through http ? So that the server can send the action to the client, whatever system it uses. If it does not exist, what are the technical difficulties you can face creating this kind of system ?
I'm talking about "static" actions, like mathematical functions for example.
You're unfortunately going to run into a problem pretty quick because, technically speaking, a universal language is impossible. Systems are going to have different architecture, different languages available, and different storage means. I believe what you intend (correct me if I'm wrong) is a "widespread" protocol. One way or another, you're going to have to drill down based on your personal use-case.
For a widespread example, you could keep a set of JavaScript files with functions server-side, and refer a web client to the one they need to run it by loading a javascript file during some event. Pass the location of the file and the function name, load it using the link above, then call the JavaScript function by name to run it. I could see this being an admitedly somewhat roundabout solution. This also may work in Java due to its built in JavaScript engine, although I haven't tested it.
Beyond that, I am unaware of anything particularly widespread. Most applications limit what they accept as instructions quite strictly to prevent security breaches (Imagine a SQL Injection that can run free on a client's machine). In fact, JavaScript limits itself quite severely, perhaps most notably in regards to local file reading.
Hopefully this helps with your ideas. Let me know in a comment if you have any questions/issues about what I've said.

Advantages of a VM

The majority of languages I have come across utilise a VM, or virtual machine. Languages such as Java (the JVM), Python, Ruby, PHP (the HHVM), etc.
Then there are languages such as C, C++, Haskell, etc. which compile directly to native.
My question is, what is the advantage of using a VM (outside of OS-independence)? Isn't using a VM just creating an extra interpretation step, by going [source code -> bytecode -> native] instead of just [source code -> native]?
Why use a VM when you can compile directly?
EDIT
My understanding is that Python, Ruby, et al. use something akin to a VM, if not exactly fitting under such a definition, where scripts are compiled to an intermediate representation (for Python, e.g. .pyc files).
EDIT 2
Yep. Looked it up. Python, Ruby and PHP all use intermediate representations, but are simply not stored in seperate files but executed by the VM directly. See question : Java "Virtual Machine" vs. Python "Interpreter" parlance?
" Even though Python uses a virtual machine under the covers, from a
user's perspective, one can ignore this detail most of the time. "
An advantage of VM is that, it is much easier to modify some parts of the code on runtime, which is called Reflection. It brings some elegance capabilities. For example, you can ask the user which function/class he want to call, and call the function/class by its STRING name. In Java programs (and maybe some other VM-based languages) users can add additional library to the program in runtime, and the library can be run immediately!
Another advantage is the ability to use advanced garbage collection, because the bytecode's structure is easier to analyze.
Let me note that a virtual machine does not always interpret the code, and therefore it is not always slower than machine code. For example, Java has a component named hotspot which searches for code blocks that are frequently called, and replaces their bytecode with native code (machine code). For instance, if a for loop is called for, say , 100+ times, hotspot converts it to machine-code, so that in the next calls it will run natively! This insures that just the bottlenecks of your code are running natively, while the rest part allows for the above advantages.
P.S. It is not impossible to compile the code directly to native code. Many VM-based languages have compiler versions (e.g. there is a compiler for PHP: http://www.phpcompiler.org). However, remember that you are disabling some of the above features by compiling the whole program to native code.
P.S. The [source-code -> byte-code] part is not a problem, it is compiled once and does not relate to execution time. I presumed you are asking why they do not execute the machine code while it is possible.
Python, Ruby, and PhP do not utilize VMs. They are, however, interpreted.
To answer your actual question: Java utilizes a VM in order to add some distance between the operating system/hardware and the code being executed. The goal there was security and hardiness (hardiness meaning there was a lower likelihood of code having an averse effect on other processes in the system.)
All the languages you listed are interpreted so I think what you may have actually meant to ask was the difference between interpreted and compiled languages. Interpreted languages are cross-platform. That is the biggest, and main, advantage. You need not compile them for each different set of hardware or operating system they operate on, and instead they will simply work everywhere.
The advantage of a compiled language, traditionally, is speed and efficiency.
Because a VM allows for the same set of instructions to be run on my different operating systems (provided they have the interperetor)
Let's take Java as an example. Java gets compiled into bytecode, which is basically a set of operations for a computer to follow. However, not all processors in computers understand the same set of instructions the same way - meaning, what one set of native instruction means on computer A could be something different on computer B.
As a result, a VM is run, with one specific to each computer. This way, the Java bytecode that is written is standardized, and only the interpreter has to work to convert it to machine language.
OS independence is a big part of it but you also get abstractions from other things like CPUs... the same Java code can execute on ARM, x86, whatever without modification so long as there is a JVM in place.

How To Distribute a Project Built In a Interpreted Language?

I've started a project(developer text editor) in a interpreted language(Tcl/Tk) and another with Perl(both are open-source), but with some time, when it gets in a Beta version, I will need to distribute it for the users(developers of course), but I want to know some things about this:
It's possible to compile it to a executable?
How?
Can I compile for other platforms?
Or in this case it's better to use a compiled language or a interpreted?
Is usual things like this?
In the users machine, he will need to have Tcl/Tk or Perl?
Both Tcl and Perl can be compiled into executables. For windows, there's perl2exe and perlcc for systems running UNIX style operating systems. As for Tcl, there is freewrap and starpacks.
If you're just doing this for the benefit of a single executable, and eliminating the need for installing Perl and other dependencies, then there's no real reason you can't do this. It's quite a nice method for testing your application without having to constantly compile, though defeats the point of using an interpreted language in the first place.
Also take a look at The Simplest Steps to Converting TCL TK to a Stand Alone Application, this page is also useful, How can I compile Tcl type scripts into binary code
The usual and common way for such scripts is to distribute the source. A binary would only work on some very specific systems but Tcl/Tk/Perl runs on so many systems, so that would be a really big restriction for no real reason. It also helps other developers more to reuse your scripts in some good way. In most cases, even when somebody could execute your binary, it wouldn't be of much help without the source.

What do you think of developing for the command line first?

What are your opinions on developing for the command line first, then adding a GUI on after the fact by simply calling the command line methods?
eg.
W:\ todo AddTask "meeting with John, re: login peer review" "John's office" "2008-08-22" "14:00"
loads todo.exe and calls a function called AddTask that does some validation and throws the meeting in a database.
Eventually you add in a screen for this:
============================================================
Event: [meeting with John, re: login peer review]
Location: [John's office]
Date: [Fri. Aug. 22, 2008]
Time: [ 2:00 PM]
[Clear] [Submit]
============================================================
When you click submit, it calls the same AddTask function.
Is this considered:
a good way to code
just for the newbies
horrendous!.
Addendum:
I'm noticing a trend here for "shared library called by both the GUI and CLI executables." Is there some compelling reason why they would have to be separated, other than maybe the size of the binaries themselves?
Why not just call the same executable in different ways:
"todo /G" when you want the full-on graphical interface
"todo /I" for an interactive prompt within todo.exe (scripting, etc)
plain old "todo <function>" when you just want to do one thing and be done with it.
Addendum 2:
It was mentioned that "the way [I've] described things, you [would] need to spawn an executable every time the GUI needs to do something."
Again, this wasn't my intent. When I mentioned that the example GUI called "the same AddTask function," I didn't mean the GUI called the command line program each time. I agree that would be totally nasty. I had intended (see first addendum) that this all be held in a single executable, since it was a tiny example, but I don't think my phrasing necessarily precluded a shared library.
Also, I'd like to thank all of you for your input. This is something that keeps popping back in my mind and I appreciate the wisdom of your experience.
I would go with building a library with a command line application that links to it. Afterwards, you can create a GUI that links to the same library. Calling a command line from a GUI spawns external processes for each command and is more disruptive to the OS.
Also, with a library you can easily do unit tests for the functionality.
But even as long as your functional code is separate from your command line interpreter, then you can just re-use the source for a GUI without having the two kinds at once to perform an operation.
Put the shared functionality in a library, then write a command-line and a GUI front-end for it. That way your layer transition isn't tied to the command-line.
(Also, this way adds another security concern: shouldn't the GUI first have to make sure it's the RIGHT todo.exe that is being called?)
Joel wrote an article contrasting this ("unix-style") development to the GUI first ("Windows-style") method a few years back. He called it Biculturalism.
I think on Windows it will become normal (if it hasn't already) to wrap your logic into .NET assemblies, which you can then access from both a GUI and a PowerShell provider. That way you get the best of both worlds.
My technique for programming backend functionality first without having the need for an explicit UI (especially when the UI isn't my job yet, e.g., I'm desigining a web application that is still in the design phase) is to write unit tests.
That way I don't even need to write a console application to mock the output of my backend code -- it's all in the tests, and unlike your console app I don't have to throw the code for the tests away because they still are useful later.
I think it depends on what type of application you are developing. Designing for the command line puts you on the fast track to what Alan Cooper refers to as "Implementation Model" in The Inmates are Running the Asylum. The result is a user interface that is unintuitive and difficult to use.
37signals also advocates designing your user interface first in Getting Real. Remember, for all intents and purposes, in the majority of applications, the user interface is the program. The back end code is just there to support it.
It's probably better to start with a command line first to make sure you have the functionality correct. If your main users can't (or won't) use the command line then you can add a GUI on top of your work.
This will make your app better suited for scripting as well as limiting the amount of upfront Bikeshedding so you can get to the actual solution faster.
If you plan to keep your command-line version of your app then I don't see a problem with doing it this way - it's not time wasted. You'll still end up coding the main functionality of your app for the command-line and so you'll have a large chunk of the work done.
I don't see working this way as being a barrier to a nice UI - you've still got the time to add one and make is usable etc.
I guess this way of working would only really work if you intend for your finished app to have both command-line and GUI variants. It's easy enough to mock a UI and build your functionality into that and then beautify the UI later.
Agree with Stu: your base functionality should be in a library that is called from the command-line and GUI code. Calling the executable from the UI is unnecessary overhead at runtime.
#jcarrascal
I don't see why this has to make the GUI "bad?"
My thought would be that it would force you to think about what the "business" logic actually needs to accomplish, without worrying too much about things being pretty. Once you know what it should/can do, you can build your interface around that in whatever way makes the most sense.
Side note: Not to start a separate topic, but what is the preferred way to address answers to/comments on your questions? I considered both this, and editing the question itself.
I did exactly this on one tool I wrote, and it worked great. The end result is a scriptable tool that can also be used via a GUI.
I do agree with the sentiment that you should ensure the GUI is easy and intuitive to use, so it might be wise to even develop both at the same time... a little command line feature followed by a GUI wrapper to ensure you are doing things intuitively.
If you are true to implementing both equally, the result is an app that can be used in an automated manner, which I think is very powerful for power users.
I usually start with a class library and a separate, really crappy and basic GUI. As the Command Line involves parsing the Command Line, I feel like i'm adding a lot of unneccessary overhead.
As a Bonus, this gives an MVC-like approach, as all the "real" code is in a Class Library. Of course, at a later stage, Refactoring the library together with a real GUI into one EXE is also an option.
If you do your development right, then it should be relatively easy to switch to a GUI later on in the project. The problem is that it's kinda difficult to get it right.
Kinda depends on your goal for the program, but yeah i do this from time to time - it's quicker to code, easier to debug, and easier to write quick and dirty test cases for. And so long as i structure my code properly, i can go back and tack on a GUI later without too much work.
To those suggesting that this technique will result in horrible, unusable UIs: You're right. Writing a command-line utility is a terrible way to design a GUI. Take note, everyone out there thinking of writing a UI that isn't a CLUI - don't prototype it as a CLUI.
But, if you're writing new code that does not itself depend on a UI, then go for it.
A better approach might be to develop the logic as a lib with a well defined API and, at the dev stage, no interface (or a hard coded interface) then you can wright the CLI or GUI later
I would not do this for a couple of reasons.
Design:
A GUI and a CLI are two different interfaces used to access an underlying implementation. They are generally used for different purposes (GUI is for a live user, CLI is usually accessed by scripting) and can often have different requirements. Coupling the two together is not a wise choice and is bound to cause you trouble down the road.
Performance:
The way you've described things, you need to spawn an executable every time the GUI needs to d o something. This is just plain ugly.
The right way to do this is to put the implementation in a library that's called by both the CLI and the GUI.
John Gruber had a good post about the concept of adding a GUI to a program not designed for one: Ronco Spray-On Usability
Summary: It doesn't work. If usability isn't designed into an application from the beginning, adding it later is more work than anyone is willing to do.
#Maudite
The command-line app will check params up front and the GUI won't - but they'll still be checking the same params and inputting them into some generic worker functions.
Still the same goal. I don't see the command-line version affecting the quality of the GUI one.
Do a program that you expose as a web-service. then do the gui and command line to call the same web service. This approach also allows you to make a web-gui, and also to provide the functionality as SaaS to extranet partners, and/or to better secure the business logic.
This also allows your program to more easily participate in a SOA environement.
For the web-service, don't go overboard. do yaml or xml-rpc. Keep it simple.
In addition to what Stu said, having a shared library will allow you to use it from web applications as well. Or even from an IDE plugin.
There are several reasons why doing it this way is not a good idea. A lot of them have been mentioned, so I'll just stick with one specific point.
Command-line tools are usually not interactive at all, while GUI's are. This is a fundamental difference. This is for example painful for long-running tasks.
Your command-line tool will at best print out some kind of progress information - newlines, a textual progress bar, a bunch of output, ... Any kind of error it can only output to the console.
Now you want to slap a GUI on top of that, what do you do ? Parse the output of your long-running command line tool ? Scan for WARNING and ERROR in that output to throw up a dialog box ?
At best, most UI's built this way throw up a pulsating busy bar for as long as the command runs, then show you a success or failure dialog when the command exits. Sadly, this is how a lot of UNIX GUI programs are thrown together, making it a terrible user experience.
Most repliers here are correct in saying that you should probably abstract the actual functionality of your program into a library, then write a command-line interface and the GUI at the same time for it. All your business logic should be in your library, and either UI (yes, a command line is a UI) should only do whatever is necessary to interface between your business logic and your UI.
A command line is too poor a UI to make sure you develop your library good enough for GUI use later. You should start with both from the get-go, or start with the GUI programming. It's easy to add a command line interface to a library developed for a GUI, but it's a lot harder the other way around, precisely because of all the interactive features the GUI will need (reporting, progress, error dialogs, i18n, ...)
Command line tools generate less events then GUI apps and usually check all the params before starting. This will limit your gui because for a gui, it could make more sense to ask for the params as your program works or afterwards.
If you don't care about the GUI then don't worry about it. If the end result will be a gui, make the gui first, then do the command line version. Or you could work on both at the same time.
--Massive edit--
After spending some time on my current project, I feel as though I have come full circle from my previous answer. I think it is better to do the command line first and then wrap a gui on it. If you need to, I think you can make a great gui afterwards. By doing the command line first, you get all of the arguments down first so there is no surprises (until the requirements change) when you are doing the UI/UX.
That is exactly one of my most important realizations about coding and I wish more people would take such approach.
Just one minor clarification: The GUI should not be a wrapper around the command line. Instead one should be able to drive the core of the program from either a GUI or a command line. At least at the beginning and just basic operations.
When is this a great idea?
When you want to make sure that your domain implementation is independent of the GUI framework. You want to code around the framework not into the framework
When is this a bad idea?
When you are sure your framework will never die