What is a Smalltalk "image"? - terminology

What is a Smalltalk "image"? Is it like serializing a Smalltalk run-time?

Most popular programming systems separate program code (in the form of class definitions, functions or procedures) from program state (such as objects or other forms of application data). They load the program code when an application is started, and any previous application state has to be recreated explicitly from configuration files or other data sources. Any settings the application programmer doesn't explicitly save, you have to set back up whenever you restart.
Many Smalltalk systems, however, do not differentiate between application data (objects) and code (classes). In fact, classes are objects themselves. Therefore most Smalltalk systems store the entire application state (including both Class and non-Class objects) in an image file. The image can then be loaded by the Smalltalk virtual machine to restore a Smalltalk-like system to a previous state.
http://en.wikipedia.org/wiki/Smalltalk#Image-based_persistence

The Smalltalk image is a very interesting beast. Look at it as a kind of immortality. Many current Smalltalk systems, Pharo, Squeak, VisualWorks among them, share a common ancestor, that is, a Smalltalk image from Xerox PARC. This common ancestor however is not some remote thing, but actually still alive in those modern systems. The modern variants were produced by sending messages to the objects in that image. Some of those messages actually morphed the current objects. Classes are full-blown objects, and creating new classes is done by sending messages to class objects. Some of the objects in a Smalltalk image may date back to 1972, when the first Smalltalk image bootstrapped! Smalltalk images never die, they just fade into something potentially fundamentally different. You should view your application building as not fundamentally different from creating a new Smalltalk version.

When the smalltalk VM starts, it loads a saved state of objects (yes: including open file streams, windows, threads and more) from the "image" into its memory and resumes execution where it left when the image was saved.
At any time during your work, you can "save an image" (aka: a snapshot of the current overall state) into an image file. You can keep multiple images on your disk. Useful if you work on different projects.
Images are often (but not in all smalltalk systems) portable across architectures; for example, a squeak image can be loaded into bot a windows and a mac (and even an android) squeak VM. Images are not portable across dialects, and sometimes not across versions within a dialect.
Images usually contain everything - even the debugger, compiler, editors, browsers etc. However, for deployment, it is sometimes useful to "strip" (i.e. remove unused stuff) from an image - either to hide secrets (;-) or to make it smaller (for embedded or mobile devices).
Most Smalltalks cannot live without an image, with the exception of Smalltalk/X and (I think) S#-Smalltalk (but I am on thin ice here...)
To save and transport source code, images are not useful - use either fileout in standard format or in xml or in any other transport format (there are many).
Images are also not useful for marshalling/unmarshalling; use xml, binarystorage, databases, glorb or any other serialization method for that.

It's serialising everything in the whole system, including all development work and all user data. Everything apart from the kernel of the runtime environment.
Smalltalk, like Java, runs on a Virtual Machine running symbolic bytecode, and it contains low-level things like the garbage collector. This makes Smalltalk very portable, and also very write-once-run-anywhere.
Unsurprisingly, this was the inspiration for Java. So the Smalltalk VM (StVM) is the equivalent of the Java Runtime Environment.
In Smalltalk, everything else is stored in RAM. The codebase, which is dynamically compiled on-the-fly for the StVM. All the object data that you've built up by running your vertical and horizontal end-user apps. All the customisation you've done to the windowing environment and its appearance. All the new code you've written. A song you have loaded onto the VM to play in a music player. Any other data, code or objects you're using or have loaded in.
It's all live in the PC's memory.
Periodically, you may want to save the current-state-play to disk. When you do that, you freeze the Smalltalk VM momentarily, and it copies everything into a single disk-file. That disk-file is called the image file, and by default it will have a .image suffix in most distributions on PCs (whether they are running Linux, MacOS, Windows or RiscOS).
It's a like the way you save your work-in-progress when you are in a word-processor or a spreadsheet on a typical PC. Except that this save includes the latest version of the spreadsheet code that the spreadsheet app itself is made out of.
The Smalltalk system does have other ways of securing your data. If you develop any software, or alter any of the codebase that the Smalltalk system is written in, it logs every change to disk in real-time.
You have the option of writing code, or loading an app, that will can save your source-code and it's associated data structures, to distributed source code repositories, or to repositories on your local disc. Or to relational databases. Or to object databases or the newly-fashionable NoSQL databases.
Most pre-written apps backup the data to disk(s) or database(s) on-the-fly.
The image is a save of the entire Smalltalk system, (apart from the Virtual Machine. The Virtual Machine is equivalent to the Java Runtime Environment. Everything else is stored in the image.
Write a new File System to access the underlying OS's discs? That's in the image. (and all the changes have also been logged to disk automatically by the Smalltalk system).
Enter a whole bunch of data into your Smalltalk image-based object database? That's in the image.
Want to do a factory-reset to your Smalltalk system? Simply go back to using the image file you received when you first installed Smalltalk. Want to save the image every hour on the hour, and then restore back to 4 hours ago? Just load the image file from four hours ago.
The image is a copy of everything that the Smalltalk system has in memory. Except for the small, unchanging, vital proportion of the system which is the Virtual Machine.

I recommend you read Pharo By Example. To quote from its first chapter,
"The current system image is a snapshot of a running Pharo system,
frozen in time. It consists of two files: an .image file, which contains the
state of all of the objects in the system (including classes and methods,
since they are objects too), and a .changes file, which contains a log of all
of the changes to the source code of the system. In Figure 1.1, these files
are called pharo.image and pharo.changes."
HTH

http://book.seaside.st/book/getting-started/pharo-squeak/what-is-image
All Smalltalk objects live in
something called an image. An image is
a snapshot of memory containing all
the objects at a given point in time.
Second hit on google.

Put simply, a Smalltalk image is an image of the Smalltalk environment which has been saved at a given point in time. When this image is reloaded into the Smalltalk runtime system, everything is as it was at the time the image was saved.
Images saved by one Smalltalk system cannot, in general, be loaded by a different Smalltalk system.
I find image-based development incredibly empowering. If I get interrupted I can save the image, and when I get back to it I'm right back where I was. Debuggers that were open are still open, waiting to continue. There's very little "got to figure out how to get back where I was" - it's more "OK, let's continue...".
Share and enjoy.

In almost every other language (apart from ABAP as far as I have been told by some senior SAP developers), you have a clear separation:
Code you are working on which defines logic
State of the program you are running
database, generally things you need as input for your code
In Smalltalk, all of this can be - note can - in the image.
In theory, you can deploy a Smalltalk application in a loaded image which brings all the data and logic and runs the application at startup.
In practice, as far as my experience goes, you tend not to do that for reasons.
If you stay inside the image, you have everything available at your disposal, for good or ill.
Classes are objects, methods are objects, so you can actually do things like add a
self halt
in a method, run some code which calls this method, change the method while it is being executed, recompile it and have the code continue.
You can also do wonderful things like passing method names as string to a method and then performing the argument without having this visible in code anywhere.
Both things are very good for learning and trying out. Not so good for production code or maintaining producion code.
The thing I was struggling with in the beginning was that for instance for UI creation, you create a window in the Smalltalk image which is then created "a second time" by the OS, with window handle and everything.
Of course, you can save the window with the Smalltalk image, it will also open up again (normally), but what happens internally is that there is a list of windows (i.e. all the UI components) which have been saved with the image in their Smalltalk state.
During Image Startup, there is a process that iterates over this list and asks the OS to recreate all of them.
In theory, you can do this for everything the OS offers:
file handles, resource handles, ports and so on
In practice, you probably dont want to do that.
The companies I worked at had nice beginner tutorials of code to run before saving your image in order not to get into trouble when restarting the next day.
Ideally, you could combine a Smalltalk image concept with peristence and just store all your objects in a real database.
I do not have the overview whether any Smalltalk dialect has done this, however.

Related

Run Perl in Browser with PerlTray

I am using perl tray from activestate and have a question. I am wanting to make some type of ui or way for a user to set "Settings" on my application. These settings can just be written / read from a text file that is stored on the users computer.
The part I am not understanding though is how to go about making a ui. The only thing i can think of is showing a local perl page that runs on their computer to write to the file. However, I'm not sure how i could get perl to run in the browser when only using perltray.
Any suggestions?
PerlTray is an odd duck. It has an implicit event loop that kicks in after you either fall off the end of your program or after your 1st call to exit(). This makes it incompatible with most other common GUI event loops or most mini-server techniques that operate in the same process & thread.
2 possibilities come to mind:
Most Likely you'll have success spawning a thread or process that creates a traditional perl GUI or a mini-server hosting your configuration web-app. I'd probably pick Tkx, but that's just my preference.
I have a suspicion that the Event Loop used by Win32::GUI may actually be compatible with the event loop in PerlTray, but some experimentation would be required to verify that. I generally avoid Win32::GUI because it's not platform independent, but if you're using PerlTray, you're tied to Windows anyway...

HTML5: accessing large structured local data

Summary:
Are there good HTML5/javascript options for selectively reading chunks of data (let's say to be eventually converted to JSON) from a large local file?
Problem I am trying to solve:
Some existing program locally and outputs a ton of data. I want to provide a browser-based interactive viewer that will allow folks to browse through these results. I have control over how the data is written out. I can write it all out in one big file, but since it's quite large, I can't just read the whole thing in memory. Hence, I am looking for some kind of indexed or db-like access to this from my webapp.
Thoughts on solutions:
1. Brute-force: HTML5 FileReader API has a nice slice() method for random access. So I could write out some kind of an index in the beginning of the file, use it to look up positions of other stored objects, and read them whenever they're needed. I figured I'd ask if there are already javascript libraries that do something like this (or better) before trying to implement this ugly thing.
2. HTML5 local database. Essentially, I am looking for an analog of HTML5 openDatabase() call that would open (a read-only) connection to a database based on a user-specified local file. From what I understand, there's no way to specify a file with a pre-loaded database. Furthermore, even if there was such a hack, it's not clear whether the local file format would be the same across browsers. I've seen the phonegap solution that populates the browser local database from SQL statements. I can do that too, but the data I am talking about is quite large (5-10GB): it will take a while to load, and such duplication seems rather pointless.
HTML5 does not sound like the appropriate answer for your needs. HTML5's focus is on the client side, and based on your description you're asking a lot out of the browsers, most likely more than they can handle.
I would instead recommend you look at a server-based solution to deliver the desired goal/results to the client view, something like Splunk would be a good product to consider.

What are logging libraries for?

This may be a stupid question, as most of my programming consists of one-man scientific computing research prototypes and developing relatively low-level libraries. I've never programmed in the large in an enterprise environment before. I've always wondered, what are the main things that logging libraries make substantially easier than just using good old fashioned print statements or file output, simple programming logic and a few global variables to determine how verbosely things get logged? How do you know when a few print statements or some basic file output ain't gonna cut it and you need a real logging library?
Logging helps debug problems especially when you move to production and problems occur on people's machines you can't control. Best laid plans never survive contact with the enemy, and logging helps you track how that battle went when faced with real world data.
Off the shel logging libraries are easy to plug in and play in less than 5 minutes.
Log libraries allow for various levels of logging per statement (FATAL, ERROR, WARN, INFO, DEBUG, etc).
And you can turn up or down logging to get more of less information at runtime.
Highly threaded systems help sort out what thread was doing what. Log libraries can log information about threads, timestamps, that ordinary print statements can't.
Most allow you to turn on only portions of the logging to get more detail. So one system can log debug information, and another can log only fatal errors.
Logging libraries allow you to configure logging through an external file so it's easy to turn on or off in production without having to recompile, deploy, etc.
3rd party libraries usually log so you can control them just like the other portions of your system.
Most libraries allow you to log portions or all of your statements to one or many files based on criteria. So you can log to both the console AND a log file.
Log libraries allow you to rotate logs so it will keep several log files based on many different criteria. Say after the log gets 20MB rotate to another file, and keep 10 log files around so that log data is always 100MB.
Some log statements can be compiled in or out (language dependent).
Log libraries can be extended to add new features.
You'll want to start using a logging libraries when you start wanting some of these features. If you find yourself changing your program to get some of these features you might want to look into a good log library. They are easy to learn, setup, and use and ubiquitous.
There are used in environments where the requirements for logging may change, but the cost of changing or deploying a new executable are high. (Even when you have the source code, adding a one line logging change to a program can be infeasible because of internal bureaucracy.)
The logging libraries provide a framework that the program will use to emit a wide variety of messages. These can be described by source (e.g. the logger object it is first sent to, often corresponding to the class the event has occurred in), severity, etc.
During runtime the actual delivery of the messaages is controlled using an "easily" edited config file. For normal situations most messages may be obscured altogether. But if the situation changes, it is a simpler fix to enable more messages, without needing to deploy a new program.
The above describes the ideal logging framework as I understand the intention; in practice I have used them in Java and Python and in neither case have I found them worth the added complexity. :-(
They're for logging things.
Or more seriously, for saving you having to write it yourself, giving you flexible options on where logs are store (database, event log, text file, CSV, sent to a remote web service, delivered by pixies on a velvet cushion) and on what is logged at runtime, rather than having to redefine a global variable and then recompile.
If you're only writing for yourself then it's unlikely you need one, and it may introduce an external dependency you don't want, but once your libraries start to be used by others then having a logging framework in place may well help your users, and you, track down problems.
I know that a logging library is useful when I have more than one subsystem with "verbose logging," but where I only want to see that verbose data from one of them.
Certainly this can be achieved by having a global log level per subsystem, but for me it's easier to use a "system" of some sort for that.
I generally have a 2D logging environment too; "Info/Warning/Error" (etc) on one axis and "AI/UI/Simulation/Networking" (etc) on the other. With this I can specify the logging level that I care about seeing for each subsystem easily. It's not actually that complicated once it's in place, indeed it's a lot cleaner than having if my_logging_level == DEBUG then print("An error occurred"); Plus, the logging system can stuff file/line info into the messages, and then getting totally fancy you can redirect them to multiple targets pretty easily (file, TTY, debugger, network socket...).

What are HTML applications?

I recenty stumbled upon some files described as "HTML Applications" on my Win XP machine.
What are they?
Who would ever use them? Why do I have like 2 or 3 of them on my PC?
How do they generally work? I mean hey - HTML is for adding formatting to text - HTML Applications? What the? Microsoft?
HTAs are good for things like VB scripts that you want an interface for other than MsgBox or a console window.
Since it's HTML, you can use buttons, text areas, check boxes, etc to show information to the user and get input from them, and use CSS to style it all. Since HTAs run on the local machine, you have access to everything you can do with VBScript for computation and file access, WMI for system management, program automation with COM objects, data access with ADO, and so on.
I once wrote an HTA that installs, updates, and compares Word templates on a user's machine from a common folder. The user can see their template folder next to the common folder to know if they are up to date, and hit the Update button if not.
Another one manages and verifies the installation of a program on a user's computer, copying over the exe if necessary, making sure registry entries are set correctly, putting a shortcut on the desktop, letting the user test and see the results of the installation, and so on. It also logs all of this info to a common place for me to check on.
One of my biggest HTA projects was a Project Manager system. The interface showed me all of the Excel, Word or Access projects I had going on. It would open the selected project in its particular environment, and showed me all of the pieces of it. It allowed me to import and export code modules from a common library using VBE automation (the Visual Basic Editor COM interface).
I'm about to put one together to show current and "dead" printer drivers on a user's machine. With me coaching them over the phone, they will run the HTA which will list all of the installed printers. They will put a check mark next to the ones they want to keep, then hit a button to delete all of the others. Fairly easy for them, and saves me from going to each and every PC to fix this.
Many of these kinds of things only make sense in a Windows environment, but you can write some pretty general purpose stuff with it too. Anything you can express in VBScript or JScript (JavaScript) and want an HTML/CSS front on is a good candidate for an HTA. I also even wrote a basic network chat system in it at one point.
There are lots of little HTAs around for converting data from one format to another, say converting comma separated data to columnar, or adding or removing various kinds of formatting like quote-printable escape codes, converting hex formatted text into plain text, and on and on. Copy text into one input text area, check a few options and press the Go button, then copy the converted data from the output text area. One I wrote was an SQL formatter. It would take SQL code and wrap it up as either a VB or Delphi string, and also
go from wrapped back to plain SQL code, with basic indenting and "pretty printing" to clean it up.
I don't do as much with HTAs as I used to, but still think they are a pretty cool technology for the kinds of jobs that fit in that niche.
See here for Introduction to HTML Applications (HTAs).

Testing and mocking with Flex

I am developing a "dumb" front-end, it's an AIR application that interacts with a "smart" LiveCycle server. There are currently about 20 request & response pairs for the application. For many reasons (testing, developing outside the corporate network, etc), we have several XML files of fake data, and if a certain configuration flag is set, the files are loaded, a specific file is parsed and used to create a mock response. Each XML file is a set of responses for different situation, all internally consistent. We currently have about 10 XML files, each corresponding to different situation we can run into. This is probably going to grow to 30-50 XML files.
The current system was developed by me during one of those 90-hour-week release cycles, when we were under duress because LiveCycle was down again and we had a deadline to meet. Most of the minor crap has been cleaned up.
The fake data is in an object called FakeData, with properties like customerType1:XML, customerType2:XML, overdueCustomer1:XML, etc. Then in the FakeData constructor, all of the properties are set like this:
customerType1:XML = FileUtil.loadXML(File.applicationDirectory.resolvePath("fakeData/customerType1.xml");
And whenever you need some fake data (this happens in special FakeDelegates that extend the real LiveCycle Delegates), you get it from an instance of FakeData.
This is awful, for many reasons, but it works. One embarrassing part is that every time you create an instance of FakeData, it reloads all the XML files.
I'm trying to figure out if there's a design pattern that is not Singleton that can handle this more elegantly. The constraints are:
No global instances can be required (currently, all the code dealing with the fake data, including the fake delegates, is pulled out of production builds without any side-effects, and it needs to stay that way). This puts the Factory pattern out of the running.
It can handle multiple objects using the XML data without performance issues.
The XML files are read centrally so that the other code doesn't have to know where the XML files are, and so some preprocessing can be done (like creating a map of certain tag values and the associated XML file).
Design patterns, or other architecture suggestions, would be greatly appreciated.
Take a look at ASMock which was developed by a good friend of mine (and a member here Richard Szalay) and is based on .nets Rhino mocks. We've used it in several production environments now so i can vouch for it's stability.
should be able to get rid of any fake tests (more like integration tests) by using the mock object instead.
Wouldn't it make more sense to do traditional mocking with a mocking framework? Depending on your implementation, it might be possible to set up the Expects by reading the fake-data XML files.
Here is a Google Code project that offers mocking for ActionScript.