Get atomic coordinates from specific residues and compute distances - function

I'm trying to extract the exact coordinates for an atom in specific residues in a PDB file.
The idea is to use a function that finds the coordinates of all oxygens of the cysteines in a file, find the nitrogen coordinates on the histidines in the same file, and calculate the distances between them.
import sys
import os
os.getcwd()
filename = sys.argv[1]
def cys_ox_coord_extr():
with open(filename, "r") as fileobj:
content = fileobj.readlines()
print("Cysteine's oxygen coordinates\n")
for line in content:
if line.startswith("ATOM"):
if str(line.split()[3]) == "CYS" and str(line.split()[2]) == "O":
cys_ox = [line.split()[5], [line.split()[6], line.split()[7], line.split()[8]]]
print(cys_ox)
The function works fine, and I can get the coordinates for this atom type, however, I can't use these results elsewhere.
I can't recall "cys_ox" anywhere outside the function itself, even using `return cys_ox.
Yet, if I print it, the results come out fine.
My goal is to get the coordinates for another atom type in another residue and compute the distances, but it is impossible if I can't use the results outside the function which generates them.
What can I do?
Thank you!!!

as Giacomo suggested,
define a list outside the function, and append cys_ox to the list. So the list has global scope and you can access outside the function. Else you should define global cys_ox at start of function, so python know that such variable should have global scope (and not the default function scope), but you may have many cys_ox, so a list is better. –
Giacomo Catenazzi

Related

Data frame error when converting iGraph to gexf object

I am trying to convert an iGraph object to a gexf object using the rgexf package so that I can write a file usable with Gephi, which I prefer for network visualization.
My iGraph object is created by reading in two CSVs: h.edges and h.nodes. There are both edge and node attributes. Once the files are read in, I create the iGraph object, calculate centrality measures and then attach the centrality measures as node attributes. The code looks like so:
iNet = graph_from_data_frame(d=h.edges, vertices = h.nodes, directed = F)
V(iNet)$degree = degree(iNet)
V(iNet)$eig = evcent(iNet)$vector
V(iNet)$betweenness = betweenness(iNet)
This appears to be working fine since I can do all the normal iGraph functions -- plot, calculate centralities, identify communities, etc. My problem comes when I try to convert this to a gexf object. I run the following code:
library(rgexf)
iNet.gexf igraph.to.gexf(iNet)
But get the below error message:
Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected
Anyone know what's happening? Although I know the example here can all be done just by uploading the two CSVs straight to Gephi and running the calculations there, the end goal is to be able to attach iGraph's more robust calculations as attributes in ways that Gephi can't.

How to select a specific .m function when two exist?

First, here the way i'm calling the function :
eval([functionName '(''stringArg'')']); % functionName = 'someStringForTheFunctionName'
Now, I have two functionName functions in my path, one that take the stringArg and another one that takes something else. I'm getting some errors because right now the first one it finds is the function that doesn't take the stringArg. Considering the way i'm calling the functionName function, how is it possible to call the correct function?
Edit:
I tried the function which :
which -all someStringForTheFunctionName
The result :
C:\........\x\someStringForTheFunctionName
C:\........\y\someStringForTheFunctionName % Shadowed
The shadowed function is the one i want to call.
Function names must be unique in MATLAB. If they are not, so there are duplicate names, then MATLAB uses the first one it finds on your search path.
Having said that, there are a few options open to you.
Option 1. Use # directories, putting each version in a separate directory. Essentially you are using the ability of MATLAB to apply a function to specific classes. So, you might set up a pair of directories:
#char
#double
Put your copies of myfun.m in the respective directories. Now when MATLAB sees a double input to myfun, it will direct the call to the double version. When MATLAB gets char input, it goes to the char version.
BE CAREFUL. Do not put these # directories explicitly on your search path. DO put them INSIDE a directory that is on your search path.
A problem with this scheme is if you call the function with a SINGLE precision input, MATLAB will probably have a fit, so you would need separate versions for single, uint8, int8, int32, etc. You cannot just have one version for all numeric types.
Option 2. Have only one version of the function, that tests the first argument to see if it is numeric or char, then branches to perform either task as appropriate. Both pieces of code will most simply be in one file then. The simple scheme will have subfunctions or nested functions to do the work.
Option 3. Name the functions differently. Hey, its not the end of the world.
Option 4: As Shaun points out, one can simply change the current directory. MATLAB always looks first in your current directory, so it will find the function in that directory as needed. One problem is this is time consuming. Any time you touch a directory, things slow down, because there is now disk input needed.
The worst part of changing directories is in how you use MATLAB. It is (IMHO) a poor programming style to force the user to always be in a specific directory based on what code inputs they wish to run. Better is a data driven scheme. If you will be reading in or writing out data, then be in THAT directory. Use the MATLAB search path to categorize all of your functions, as functions tend not to change much. This is a far cleaner way to work than requiring the user to migrate to specific directories based on how they will be calling a given function.
Personally, I'd tend to suggest option 2 as the best. It is clean. It has only ONE main function that you need to work with. If you want to keep the functions district, put them as separate nested or sub functions inside the main function body. Inside of course, they will have distinct names, based on how they are driven.
OK, so a messy answer, but it should do it. My test function was 'echo'
funcstr='echo'; % string representation of function
Fs=which('-all',funcstr);
for v=1:length(Fs)
if (strcmp(Fs{v}(end-1:end),'.m')) % Don''t move built-ins, they will be shadowed anyway
movefile(Fs{v},[Fs{v} '_BK']);
end
end
for v=1:length(Fs)
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile([Fs{v} '_BK'],Fs{v});
end
try
eval([funcstr '(''stringArg'')']);
break;
catch
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile(Fs{v},[Fs{v} '_BK']);
end
end
end
for w=1:v
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile([Fs{v} '_BK'],Fs{v});
end
end
You can also create a function handle for the shadowed function. The problem is that the first function is higher on the matlab path, but you can circumvent that by (temporarily) changing the current directory.
Although it is not nice imo to change that current directory (actually I'd rather never change it while executing code), it will solve the problem quite easily; especially if you use it in the configuration part of your function with a persistent function handle:
function outputpars = myMainExecFunction(inputpars)
% configuration
persistent shadowfun;
if isempty(shadowfun)
funpath1 = 'C:\........\x\fun';
funpath2 = 'C:\........\y\fun'; % Shadowed
curcd = cd;
cd(funpath2);
shadowfun = #fun;
cd(curcd); % and go back to the original cd
end
outputpars{1} = shadowfun(inputpars); % will use the shadowed function
oupputpars{2} = fun(inputparts); % will use the function highest on the matlab path
end
This problem was also discussed here as a possible solution to this problem.
I believe it actually is the only way to overload a builtin function outside the source directory of the overloading function (eg. you want to run your own sum.m in a directory other than where your sum.m is located.)
EDIT: Old answer no longer good
The run command won't work because its a function, not a script.
Instead, your best approach would be honestly just figure out which of the functions need to be run, get the current dir, change it to the one your function is in, run it, and then change back to your start dir.
This approach, while not perfect, seems MUCH easier to code, to read, and less prone to breaking. And it requires no changing of names or creating extra files or function handles.

Call a function that is not on the Matlab path WITHOUT ADDING THAT PATH

I have been searching an entire afternoon and have found no solution to call in matlab a function by specifying its path and not adding its directory to the path.
This question is quite similar to Is it possible to call a function that is not in the path in MATLAB?, but in my case, I do not want to call a built-in function, but just a normal function as defined in an m-file.
I think handles might be a solution (because apparently they can refer to functions not on the path), but I again found no way to create a handle without cd-ing to the directory, creating it there and the cd-ing back. Trying to 'explore' what a function handle object is and how to make one with a reference to a specific function not on the path has led me nowhere.
So the solution might come from two angles:
1) You know how to create a handle for an m-file in a specific directory.
2) You know a way to call a function not on the matlab path.
EDIT: I have just discovered the function functions(myhandle) which actually lets you see the filepath to which the handle is referring. But still no way to modify it though...
This is doable, but requires a bit of parsing, and a call to evalin.
I added (many years ago!) a function to the MATLAB Central File Exchange called externalFcn
http://www.mathworks.com/matlabcentral/fileexchange/4361-externalfcn
that manages calls to off-path functions. For instance, I have a function called offpathFcn that simply returns a structure with a success message, and the value of an input. Storing that function off my MATLAB path, I can call it using:
externalfcn('out = C:\MFILES_OffPath\offpathFcn(''this is a test'')');
This returns:
out =
success: 1
input: 'this is a test'
(Note that my implementation is limited, and improvable; you have to include an output with an equal sign for this to work. But it should show you how to achieve what you want.)
(MathWorks application engineer)
The solution as noted in the comment 1 to create a function handle before calling the function is nicely implemented by #Rody Oldenhuis' FEX Contribution:
http://www.mathworks.com/matlabcentral/fileexchange/45941-constructor-for-functionhandles
function [varargout]=funeval(fun,varargin)
% INPUT:
% fun: (char) full path to function file
curdir=cd;
[fundir,funname]=fileparts(fun);
cd(fundir);
[varargout{1:nargout}] =feval(funname,varargin{:})
cd(curdir);
I've modified Thierry Dalon's code to avoid the use of feval, which I always feel uncomfortable with. Note this still doesn't get around cd-ing to the directory in question, but well, it happens behind the scenes, so pretend it doesn't happen :-)
Also note what Ben Voigt pointed out above: calls to helper functions off the path will fail.
function [varargout]=funeval(FunctionHandle, FunctionPath, varargin)
% INPUT:
% FunctionHandle: handle to the function to be called; eg #MyFunction
% FunctionPath: the path to that function
% varargin: the arguments to be passed to Myfunction
curdir=cd;
cd(FunctionPath)
[varargout{1:nargout}] = FunctionHandle(varargin{:});
cd(curdir);
end
and calling it would look like
Output = funeval(#MyFunction, 'c:\SomeDirOffMatlabsPath\', InputArgToMyFunc)
The run command can run a script file from any directory, but it can't call a function (with input and output arguments).
Neither feval nor str2func permit directory information in the function string.
I suggest writing your own wrapper for str2func that:
saves the working directory
changes directory to the script directory
creates a function handle
restores the original working directory
Beware, however, that a handle to a function not in the path is likely to break, because the function will be unable to invoke any helper code stored in other files in its directory.

'Invalid Handle object' when using a timer inside a function in MatLab

I am using a script in MatLab that works perfectly fine by itself, but I need to make a function out of it.
The script read a .csv file, extract all values, start a timer, and at each tick displays the corresponding coordinates extracted from the .csv, resulting in a 3D animation of my graph.
What I would like is to give it the location of the .csv, so that it starts displaying the graphs for this csv.
Here is what I have come up with:
function handFig(fileLoc)
csv=csvread(fileLoc,1,0);
both = csv(:,2:19);
ax=axes;
set(ax,'NextPlot','replacechildren');
Dt=0.1; %sampling period in secs
k=1;
hp1=text(both(k,1),both(k,2),both(k,3),'thumb'); %get handle to dot object
hold on;
hp2=text(both(k,4),both(k,5),both(k,6),'index');
hp3=text(both(k,7),both(k,8),both(k,9),'middle');
hp4=text(both(k,10),both(k,11),both(k,12),'ring');
hp5=text(both(k,13),both(k,14),both(k,15),'pinky');
hp6=text(both(k,16),both(k,17),both(k,18),'HAND');
L1=plot3([both(k,1),both(k,16)],[both(k,2),both(k,17)],[both(k,3),both(k,18)]);
L2=plot3([both(k,4),both(k,16)],[both(k,5),both(k,17)],[both(k,6),both(k,18)]);
L3=plot3([both(k,7),both(k,16)],[both(k,8),both(k,17)],[both(k,9),both(k,18)]);
L4=plot3([both(k,10),both(k,16)],[both(k,11),both(k,17)],[both(k,12),both(k,18)]);
L5=plot3([both(k,13),both(k,16)],[both(k,14),both(k,17)],[both(k,15),both(k,18)]);
hold off;
t1=timer('TimerFcn','k=doPlot(hp1,hp2,hp3,hp4,hp5,hp6,L1,L2,L3,L4,L5,both,t1,k)','Period', Dt,'ExecutionMode','fixedRate');
start(t1);
end
And the doplot function used:
function k=doPlot(hp1,hp2,hp3,hp4,hp5,hp6,L1,L2,L3,L4,L5,pos,t1,k)
k=k+1;
if k<5000%length(pos)
set(hp1,'pos',[pos(k,1),pos(k,2),pos(k,3)]);
axis([0 255 0 255 0 255]);
set(hp2,'pos',[pos(k,4),pos(k,5),pos(k,6)]);
set(hp3,'pos',[pos(k,7),pos(k,8),pos(k,9)]);
set(hp4,'pos',[pos(k,10),pos(k,11),pos(k,12)]);
set(hp5,'pos',[pos(k,13),pos(k,14),pos(k,15)]);
set(hp6,'pos',[pos(k,16),pos(k,17),pos(k,18)]);
set(L1,'XData',[pos(k,1),pos(k,16)],'YData',[pos(k,2),pos(k,17)],'ZData',[pos(k,3),pos(k,18)]);
set(L2,'XData',[pos(k,4),pos(k,16)],'YData',[pos(k,5),pos(k,17)],'ZData',[pos(k,6),pos(k,18)]);
set(L3,'XData',[pos(k,7),pos(k,16)],'YData',[pos(k,8),pos(k,17)],'ZData',[pos(k,9),pos(k,18)]);
set(L4,'XData',[pos(k,10),pos(k,16)],'YData',[pos(k,11),pos(k,17)],'ZData',[pos(k,12),pos(k,18)]);
set(L5,'XData',[pos(k,13),pos(k,16)],'YData',[pos(k,14),pos(k,17)],'ZData',[pos(k,15),pos(k,18)]);
else
k=1;
set(hp1,'pos',[pos(k,1),pos(k,2),pos(k,3)]);
axis([0 255 0 255 0 255]);
set(hp2,'pos',[pos(k,4),pos(k,5),pos(k,6)]);
set(hp3,'pos',[pos(k,7),pos(k,8),pos(k,9)]);
set(hp4,'pos',[pos(k,10),pos(k,11),pos(k,12)]);
set(hp5,'pos',[pos(k,13),pos(k,14),pos(k,15)]);
set(hp6,'pos',[pos(k,16),pos(k,17),pos(k,18)]);
end
However, when I run handFig('fileName.csv'), I obtain the same error everytime:
??? Error while evaluating TimerFcn for timer 'timer-7'
Invalid handle object.
I figured that it might come from the function trying to create a new 'csv' and 'both' everytime, so I tried removing them, and feeding the function the data directly, without results.
What is exactly the problem? Is there a solution?
Thanks a lot!
I think it's because when you call doPlot in the timer for the first time, you pass in t1 as an argument, and it might not exist the first time.
Does doPlot need t1 at all? I'd suggest modifying it so it's not used, and then your call to:
t1=timer('TimerFcn','k=doPlot(hp1,hp2,hp3,hp4,hp5,hp6,L1,L2,L3,L4,L5,both,k)','Period', Dt,'ExecutionMode','fixedRate');
Note the missing t1 in the doPlot call.
Either that, or initialise your t1 before you create the timer so it has some value to pass in.
Update (as an aside, can you use pause(Dct) in a loop instead? seems easier)
Actually, now I think it's a problem of scope.
It took a bit of digging to get to this, but looking at the Matlab documentation for function callbacks, it says:
When MATLAB evaluates function handles, the same variables are in scope as when the function handle was created. (In contrast, callbacks specified as strings are evaluated in the base workspace.)
You currently give your TimerFcn argument as a string, so k=doPlot(...) is evaluated in the base workspace. If you were to go to the matlab prompt, run handFig, and then type h1, you'd get an error because h1 is not available in the global workspace -- it's hidden inside handFig.
That's the problem you're running into.
However, the workaround is to specify your function as a function handle rather than a string (it says function handles are evaluated in the scope in which they are created, ie within handFig).
Function handles to TimerFcn have to have two arguments obj and event (see Creating Callback Functions). Also, that help file says you have to put doPlot in its own m-file to have it not evaluate in the base Matlab workspace.
In addition to these two required input arguments, your callback
function can accept application-specific arguments. To receive these
input arguments, you must use a cell array when specifying the name of
the function as the value of a callback property. For more
information, see Specifying the Value of Callback Function Properties.
It goes through an example of what you have to do to get this working. Something like:
% create timer
t = timer('Period', Dt,'ExecutionMode','fixedRate');
% attach `k` to t so it can be accessed within doPlot
set(t,'UserData',k);
% specify TimerFcn and its extra arguments:
t.TimerFcn = { #doPlot, hp1, hp2, hp3, ...., both };
start(t)
Note -- the reason k is set in UserData is because it needs to be somehow saved and modified between calls to doPlot.
Then modify your doPlot to have two arguments at the beginning (which aren't used), and not accept the k argument. To extract k you do get(timer_obj,'UserData') from within doPlot:
function k=doPlot(timer_obj, event, hp1,hp2,hp3,.....)
k = get(timer_obj,'UserData');
.... % rest of code here.
% save back k so it's changed for next time!
set(timer_obj,'UserData',k);
I think that's on the right track - play around with it. I'd highly recommend the mathworks forums for this sort of thing too, those people are whizzes.
This thread from the mathworks forum was what got me started and might prove helpful to you.
Good luck!

Accessing the Body of a Function with Lua

I'm going back to the basics here but in Lua, you can define a table like so:
myTable = {}
myTable [1] = 12
Printing the table reference itself brings back a pointer to it. To access its elements you need to specify an index (i.e. exactly like you would an array)
print(myTable ) --prints pointer
print(myTable[1]) --prints 12
Now functions are a different story. You can define and print a function like so:
myFunc = function() local x = 14 end --Defined function
print(myFunc) --Printed pointer to function
Is there a way to access the body of a defined function. I am trying to put together a small code visualizer and would like to 'seed' a given function with special functions/variables to allow a visualizer to 'hook' itself into the code, I would need to be able to redefine the function either from a variable or a string.
There is no way to get access to body source code of given function in plain Lua. Source code is thrown away after compilation to byte-code.
Note BTW that function may be defined in run-time with loadstring-like facility.
Partial solutions are possible — depending on what you actually want to achieve.
You may get source code position from the debug library — if debug library is enabled and debug symbols are not stripped from the bytecode. After that you may load actual source file and extract code from there.
You may decorate functions you're interested in manually with required metadata. Note that functions in Lua are valid table keys, so you may create a function-to-metadata table. You would want to make this table weak-keyed, so it would not prevent functions from being collected by GC.
If you would need a solution for analyzing Lua code, take a look at Metalua.
Check out Lua Introspective Facilities in the debugging library.
The main introspective function in the
debug library is the debug.getinfo
function. Its first parameter may be a
function or a stack level. When you
call debug.getinfo(foo) for some
function foo, you get a table with
some data about that function. The
table may have the following fields:
The field you would want is func I think.
Using the debug library is your only bet. Using that, you can get either the string (if the function is defined in a chunk that was loaded with 'loadstring') or the name of the file in which the function was defined; together with the line-numbers at which the function definition starts and ends. See the documentation.
Here at my current job we have patched Lua so that it even gives you the column numbers for the start and end of the function, so you can get the function source using that. The patch is not very difficult to reproduce, but I don't think I'll be allowed to post it here :-(
You could accomplish this by creating an environment for each function (see setfenv) and using global (versus local) variables. Variables created in the function would then appear in the environment table after the function is executed.
env = {}
myFunc = function() x = 14 end
setfenv(myFunc, env)
myFunc()
print(myFunc) -- prints pointer
print(env.x) -- prints 14
Alternatively, you could make use of the Debug Library:
> myFunc = function() local x = 14 ; debug.debug() end
> myFunc()
> lua_debug> _, x = debug.getlocal(3, 1)
> lua_debug> print(x) -- prints 14
It would probably be more useful to you to retrieve the local variables with a hook function instead of explicitly entering debug mode (i.e. adding the debug.debug() call)
There is also a Debug Interface in the Lua C API.