How does autologging work in MLOps platforms like Comet or MLFlow? - comet

I was wondering how the implementation of logging is done where you just need to create an experiment object from comet_ml and it auto-detects and gives out all the statistics of the trained experiment. Is there some sort of logging system used?

Comet ML is not a supported MLflow flavor. You need to write your own custom flavor for it.
https://mlflow.org/docs/latest/models.html#built-in-model-flavors
https://mlflow.org/docs/latest/models.html#custom-flavors

Related

Good practices for app configuration storage?

We have a number of loosely coupled apps, some in PHP and some in Python.
It would be beneficial to have some centralized place where they could get both global and app-specific configuration information.
Something like, for Python:
conf=config_server.get_params(url='http://config_server/get/My_app/all', auth=my_auth_data)
and then ideally use parameters as potentially nested attributes, eg. conf.APP.URL, conf.GLOBAL.MAX_SALES
I was considering making my own config server app, but wasn't sure, what would be the pros and cons of such approach vs. eg. storing config in centralized database or any other multiple-site accessible mode.
Also, if I perhaps was missing some readily available tool with good support, which could do this (I had a look at Puppet and Ansible, but they seemed to be very evolved tools for doing so much more than this. I also looked at software recommnedation SE for this, but they have a number of such question unanswered already).
I think it would be a good idea for your configuration mechanism not to be hard-coded to obtain configuration data via a particular technology (such as file, web server or database), but rather be able to obtain configuration data from any of several different technologies. I illustrate this with the following pseudo-code examples:
cfg = getConfig("file.cfg"); # from a file
cfg = getConfig("file#file.cfg"); # also from a file
cfg = getConfig("url#http://config_server/file.cfg"); # from the specified URL
cfg = getConfig("exec#getConfigFromDB.py"); # from stdout of command
The parameter passed to getConfig() might be obtained from, say, a command-line option. The "exec#..." format is a flexible mechanism, but carries the potential danger of somebody specifying a malicious command to execute, for example, "exec#rm -rf /".
This approach means you can experiment with whatever you consider to be an ideal source-of-configuration-data technology and later, if you discover that technology to be inappropriate, it will be trivial to discard it and use a different source-of-configuration-data technology instead. Indeed, the decision for which source-of-configuration-data technology to use might vary from one use case/user to another.
I developed a C++ and Java configuration file parser (sorry, no Python or PHP implementations) called Config4*. If you look at chapters 2 (overview of syntax) and 3 (overview of API) of the Config4* Getting Started Guide, you will notice that it supports the kind of flexible approach I discuss in this answer (the "url#... format is not supported, but "exec#curl -sS ..." provides the same functionality). 99 percent of the time, I end up using configuration files, but I find it comforting to know that my applications can trivially switch to using a different-source-of-configuration-data technology whenever the need might arise.

Inspiration on how to build a great command line interface

I am in the process of building interactive front-ends to a
distributed application which to date has been used to run workloads
that had a batch-job like structures and needed no UI at all. The application is mostly written in Perl and C and runs on a mix of Unix and Windows machines, but I think this isn't relevant to the UI.
The first such frontend is going have a command-line user interface --
currently, I envision something similar to the CLIs of the Procurve
switches and Cisco routers that I have worked with.
Like modern network gear CLIs, commands are going to resemble
simple sentences, (i.e. show vlans ports 1-4) and the CLI will
have some implicit state, much in the way that Unix shells and
cmd.exe in Windows have environment variables and current working
directories. Moreover, I'd like to implement great tab completion that
is aware of the application's state as much as possible and I want to be able to do that with as
little application-specific code as possible.
The low-level functionality (terminal I/O) seems easy to implement on
top of GNU Readline or similar libraries, but that's only where the
real fun starts. So far I have looked at the Perl modules
Term::Shell
and
Term::ShellUI,
but I'm not convinced that I want to use either of them. I am still
considering rolling my own solution and at the moment I am primarily looking for
inspiration.
Can you recommend any application or library, regardless of
implementation language, that implements a good CLI from which I can
borrow ideas?
I suggest you take a look at the philosophy underlying Microsoft PowerShell. From the idea of piping typed objects between commands to the consistency of its commands and argument syntax, I think it can be a source of inspiration.
You could try having a look at libcli :
"Libcli provides a shared library for
including a Cisco-like command-line
interface into other software."
http://code.google.com/p/libcli/
BTW - I forgot to mention that it is GNU Lesser GPL and actually used by Cisco in some products.
As for your last sentence/question, I'm particularly fond of zsh completion and line editing (zle).

Does a Polymorphic engine have any other uses besides writing viruses?

Does a Polymorphic engine have any other uses besides writing viruses?
It can be used to protect your application against crackers/reverse engineers.
It can be used to tailor a particular piece of javascript for a particular user-agent / version.
Polymorphic engines are written to challenge scanners looking for digital prints.
The purpose is to change the print without modifying the behavior of the program, or data.
You could, for instance, apply a polymorphic engine on a copyrighted program, and then bypass a trial for pirating :)

Open alternatives to Windows Workflow

Pre-warning: There are some other questions similar to this but don't quite answer the question (these include: Alternatives to Windows Workflow Foundation?, Can anyone recommend a .Net open source alternative to Windows Workflow?)
We are developing a system that is an event based state machine, currently we are investigating windows workflow, our system needs to be low latency in its response to events from a multitude of sources (xmpp, http, sms, phone call, email etc etc) coming into the system, scalable and resilient and most importantly customisable. For a variety of reasons (and due diligence) I am looking for open workflow engines that support functions similar to Windows Workflow Foundation (and more - if possible), mainly (but it doesn't matter too much if there are engines that don't support some features):
Persistence of long running tasks, and resumption of tasks on external events
High performance, low latency
Ability to develop custom actions
The ability to specify workflows dynamically
Tracking and tracing
I am not constrained to platform or language, and I would love some help and tips from you guys so that I can start to investigate the engines more closely and any experiences you had with the engines.
Paul.
I invite you to examine Stateless further, as suggested in the answer to my SO question can-anyone-recommend-a-net-open-source-alternative-to-windows-workflow. to achieve the goal of a long running state machine is very simple in that you can store the current state of your state in a database and re-sync the state machine when needed. Consider the following code from the stateless site:
Stateless has been designed with
encapsulation within an ORM-ed domain
model in mind. Some ORMs place
requirements upon where mapped data
may be stored. To this end, the
StateMachine constructor can accept
function arguments that will be used
to read and write the state values:
var stateMachine = new StateMachine<State, Trigger>(
() => myState.Value,
s => myState.Value = s);
With very little effort you can persist your state, then retrieve that state easily later on.
In respect updating the workflow dynamically, if you configure a state machine such as
var stateMachine = new StateMachine<string, int>();
and maintain a separate file of states and triggers in XML, you can perform a configuration at runtime by looping through the string int value pairs.
"Java side":
Apache ODE (Orchestration Director Engine) executes business processes written following the WS-BPEL standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.
http://ode.apache.org/
OSWorkflow can be considered a "low level" workflow implementation. Situations like "loops" and "conditions" that might be represented by a graphical icon in other workflow systems must be "coded" in OSWorkflow.
http://www.opensymphony.com/osworkflow/
Shark is an extendable workflow engine framework including a standard implementation completely based on WfMC specifications using XPDL (without any proprietary extensions !) as its native workflow process definition format and the WfMC "ToolAgents" API for serverside execution of system activitie
http://www.enhydra.org/workflow/shark/index.html
Python side:
http://bika.sourceforge.net/
http://www.vivtek.com/wftk/
I this will help you :-)
You might consider implementing your flow as an actual state machine. Tools like State Machine Compiler and Ragel can help with this. State machines, in many circumstances, are just what you need to implement insanely complex behavior that is testable, and rock-solid. I don't claim to be a Windows work flow expert, but from what I have seen, I question its superiority over coding your own state machine, either by hand or using a tool.
You might want to check out Simple State Machine.
If you feel like you want to have more control over things and want to roll your own it might be helpful to check out the Saga support that projects like NServiceBus and MassTransit use. Sagas look to be very similar to WF workflows but are POCO objects and I believe both projects just use NHibernate for Saga persistence.
I'm going to recommend you take a few hours to look at the book Open-Source ESBs in Action. "Orchestration" and "Choreography" are the key buzzwords to look at when dealing with "enterprise service busses." The systems for .NET are quite expensive (BizTalk is in the price range of a decent car, the price of Tibco is in the price range of a decent house).
Other links:
Open ESB project
Comparison of OpenESB and ServiceMix (both of which are the subject of the "In Action" book above.
Try Drools for JAVA, I personally have never tried it but I know several commercial applications are based on drools.
http://www.jboss.org/drools/
You could also upgrade to .NET 4.0 there are major improvements in the Workflow in the new framework. I know if I was writing a new workflow application I would jump to 4.0.
Good Luck
JBoss JBPM
Consider Workflow Engine, a lightweight all-in-one component that enables you to add custom executable workflows of any complexity to any .NET or Java software, be it your own creation or a third-party solution, with minimal changes to existing code. It supports custom actions and commands, has timers and supports parallel workflows. And there's a free version.
You can take a look at Imixs-Workflow, which is an event driven approach of a state machine based on bpmn 2.0. It specially focuses on human-centric long running tasks.

Anyone using Lisp for a MySQL-backended web app?

I keep hearing that Lisp is a really productive language, and I'm enjoying SICP. Still, I'm missing something useful that would let me replace PHP for server-side database interaction in web applications.
Is there something like PHP's PDO library for Lisp or Arc or Scheme or one of the dialects?
newLisp has support for mysql5 and if you look at the mysql5 function calls, you'll see that it's close to PDO.
Since nobody has mentioned it, you can try Postmodern, which is an interface to PostgreSQL. It aims for a tighter integration with PostgreSQL and so doesn't pretend to portability between databases.
I've put it together with hunchentoot and cl-who and built a pretty nice website.
newLISP - http://www.newlisp.org/ - has support for MySQL, but I haven't used it (newLISP).
If you're happy with SQL as part of your life, CL-SQL provides a mapping into CLOS objects. It appears to be more mature than Elephant.
I'm using it on my own website.
I've had good success with SBCL and CL-SQL. CL-SQL has a object mapping API, but I used the plain SQL API which simply returns lists and this worked well enough. And in the Clojure language, you interact with JDBC through a maps or structs {:col1 "a", :col2 "b"}, so a generated class library doesn't get you any simpler code, the language handles it nicely. In my experience, there is less cruft between lisp and sql than between more static languages and sql.
our Common Lisp ORM solution is http://common-lisp.net/project/cl-perec/
the underlying SQL lib is http://common-lisp.net/project/cl-rdbms/ (fully tested with PostgreSQL, has a toy SQlite backend and a somewhat tested Oracle backend)
we started out using CLSQL, but after some struggle we decided to roll our own.
these libs and PostgreSQL are used in a clustered web application developed for the Hungarian government for planning the budget of the municipalities. it has about 4000 users, 500 at peek time. a little more info is available at http://common-lisp.net/project/cl-dwim/
Cliki is a good resource for Common Lisp libraries:
http://www.cliki.net/database
There is a project named Elephant (http://common-lisp.net/project/elephant/index.html), which is an abstraction for object persistence in CL.
As long as you're switching your Webapp on Lisp, consider using persistence: you now have a constantly running Lisp image that holds everything about your application. I personnally used Elephant for that.
Elephant can use CL-SQL or BDB as it's backend, which means that you can use MySQL if you have one running. I found using SQLite really practical, though.
We use SBCL, UCW, CL-SQL and MySQL as our back-end for Paragent.com. It has worked very well for us. We also have a number of clients using UCW/CL-SQL/MySQL for custom sites we have built them through our consulting arm Bitfauna.