ImageJ Macro for converting multichannel Tiffs to Tiffs with only specified channels - tiff

I have a quite simple programming question I was hoping somebody could help me with.
I'm working with Tiff files with several channels (all contained in a .lif file, which is Leica format). I want a way to easily convert all my Tiffs to Tiffs containing only a few of the channels (which I specify). Right now I'm doing it manually, for each image and it is tedious. I have no experience in writing macros and some help or a starting point would be much appreciated. I'm sure it's not a complicated macro to write.
As of now I'm using the following manual routine and commands after I have opened all my Tiffs:
Image > Stacks > Stack to Images - separates the stacked imaged into individual images
Close images I dont wan't to be in stack.
Image > Stacks > Images to Stack - Returns the remaining images to a stack and renames it.
Image > Hyperstacks > Stack to Hyperstack - here I change it so that the image has 3 channels.
Save the new Tiff with the desired channels and name.
Close the Tiff and repeat for all Tiffs.
What I want is a Macro that loops the above steps for all open Tiffs, letting the user specify the channels (eg. keep channels: 2,3 and 5). I know it's a very simple programming task, but I could really use some help getting it done.
Thanks!
Johannes

There are several less complex possibilities to create a stack with only a subset of channels:
Image > Stacks > Tools > Make Substack..., which lets you specify the channels/slices, and gets recorded as:
run("Make Substack...", "channels=1,3-5");
Image > Duplicate..., where you can select a continuous range of channels, such as:
run("Duplicate...", "duplicate channels=1-5");
To apply this procedure to all images in a folder, have a look at the Process Folder template in the Script Editor (Templates > IJ1 Macro > Process Folder) and at the documentation on the Fiji wiki:
Scripting toolbox
How to apply a common operation to a complete directory

Thanks for the help Jan Eglinger, back from vacation I managed to write the macro, which was simple with your help :) Based on the template it looks like this (I just gave them incremental names which is fine for my purpose, but could be made more allround i guess):
/*
* Macro to for converting multichannel Tiffs to Tiffs with only specified channels, processes multiple images in a folder
*/
input = getDirectory("Input directory");
output = getDirectory("Output directory");
Dialog.create("File type");
Dialog.addString("File suffix: ", ".tif", 5);
Dialog.show();
suffix = Dialog.getString();
processFolder(input);
function processFolder(input) {
list = getFileList(input);
for (i = 0; i < list.length; i++) {
if(File.isDirectory(input + list[i]))
processFolder("" + input + list[i]);
if(endsWith(list[i], suffix))
processFile(input, output, list[i]);
}
}
function processFile(input, output, file) {
open(input + file);
print("Processing: " + input + file);
run("Make Substack...", "channels=1,2,4"); //Specify which channels should be in the final tif
print("Saving to: " + output);
saveAs("Tiff", output + i);
close("*");
}

Related

Can I modify the total number of pages in multi-page tiff?

I am receiving data from a camera and saving each image as a page in a multi-page tiff. I can set that each file has e.g. 100 pages and I am calling:
TIFFSetField(out, TIFFTAG_PAGENUMBER, page_number, total_pages);
However, if I am unable to write data to disk fast enough, I will stop the acquisition. At this point I may have written 50 out of 100 pages into a multi-page tiff. Now the multi-page tiff file reports total number of pages as 100, but only 50 pages have been actually written. Some applications will report 100 pages, but for pages 51-100 there will be no data and images will appear to be black.
Therefore I would need to update the total_pages number at the moment when I am ending the disk writing to the value of the last written page.
Can this be done at all? Is the total_pages value written once into a common header which I could update and fix the file in this way, or is this value written into each page which means I would have to edit each page that has already been written to disk? Or is there any better approach how to handle this?
Actually, the solution is quite simple.
Once your image stream has ended and before you close the file, you have to iterate through all the directories (images in the multi-page tiff) and update the TIFFTAG_PAGENUMBER to the total pages written.
The catch is you have to do it before you close the tiff by calling TIFFClose. Once a TIFF is closed, its TAGS cannot be edited anymore. (see http://www.libtiff.org/libtiff.html):
Note that unlike the stdio library TIFF image files may not be opened for both reading and writing; there is no support for altering the contents of a TIFF file.
if (pagesTotal - pagesWritten > 0)
{
for (int i = 0; i < pagesWritten; i++)
{
int retVal = TIFFSetDirectory(out, i);
retVal = TIFFSetField(out, TIFFTAG_PAGENUMBER, i, pagesWritten);
retVal = TIFFWriteDirectory(out);
}
}
TIFFClose(out);
pagesTotal is number of pages that we inteded to write into this multi-page file
pagesWritten is number of pages that we actually wrote into the file

Append string to a Flash AS3 shared object: For science

So I have a little flash app I made for an experiment where users interact with the app in a lab, and the lab logs the interactions.
The app currently traces a timestamp and a string when the user interacts, it's a useful little data log in the console:
trace(Object(root).my_date + ": User selected the cupcake.");
But I need to move away from using traces that show up in the debug console, because it won't work outside of the developer environment of Flash CS6.
I want to make a log, instead, in a SO ("Shared Object", the little locally saved Flash cookies.) Ya' know, one of these deals:
submit.addEventListener("mouseDown", sendData)
function sendData(evt:Event){
{
so = SharedObject.getLocal("experimentalflashcookieWOWCOOL")
so.data.Title = Title.text
so.data.Comments = Comments.text
so.data.Image = Image.text
so.flush()
}
I don't want to create any kind of architecture or server interaction, just append my timestamps and strings to an SO. Screw complexity! I intend to use all 100kb of the SO allocation with pride!
But I have absolutely no clue how to append data to the shared object. (Cough)
Any ideas how I could create a log file out of a shared object? I'll be logging about 200 lines per so it'd be awkward to generate new variable names for each line then save the variable after 4 hours of use. Appending to a single variable would be awesome.
You could just replace your so.data.Title line with this:
so.data.Title = (so.data.Title is String) ? so.data.Title + Title.text : Title.text; //check whether so.data.Title is a String, if it is append to it, if not, overwrite it/set it
Please consider not using capitalized first letter for instance names (as in Title). In Actionscript (and most C based languages) instance names / variables are usually written with lowercase first letter.

How can I reference a Flash file's own filename from actionscript?

In our pipeline, we ultimately publish HTML5 using Toolkit for CreateJS. However one of the steps to get us to that HTML5 includes publishing an SWF that outputs some Javascript code. I've mostly managed to automate this using JSFL. However, at present there is one line of AS3 that our artists have to find on the timeline and change manually, but it interrupts their workflow, and if they miss it or mess it up it is hard to catch, so I would like to automate it too:
Object(root).log.text += " root.skillAnime189 = factory();\n";
From the above, "skillAnime189.fla" is the name of the .fla file which contains this code. This is the case if the artist is working on Skill Animation #189, but if he is doing #304 or #6 or #1022 (no padding) the number changes accordingly, and he has to update that line accordingly.
So, I would like to change that line to something like:
var flaName:String = getThisFlashFileName().split(".")[0];
Object(root).log.text += " root." + flaName + " = factory();\n";
but I am at a loss as to how to access the name of the .fla file containing the code.
The common way to get the swf name is to parse stage.loaderInfo.url parameter:
var url:String = stage.loaderInfo.url;
url = url.split("?")[0]; //remove query string after "?"
var swfname:String = url.substring(url.lastIndexOf("/")+1);
trace(swfname);
output:
astest.swf
But this code gives swf but, rather than fla, so you need to maintain the same names for flas and published swfs (any case usually they are the same, so it shouldn't be the problem)

Processing ELF relocations - understanding the relocs, symbols, section data, and how they work together

TL;DR
I tried to make this a short question but it's a complicated problem so it ended up being long. If you can answer any part of this or give any suggestions or tips or resources or anything at all, it would be extremely helpful (even if you don't directly solve all my issues). I'm banging my head against the wall right now. :)
Here are the specific issues I am having. Read on below for more information.
I'm looking for guidance on how to process relocation entries and update the unresolved symbols in the section data. I simply don't understand what to do with all the information I've pulled from the relocations and the sections, etc.
I'm also hoping to understand just what is going on when the linker encounters relocations. Trying to correctly implement the relocation equations and use all the correct values in the correct way is incredibly challenging.
When I encounter op codes and addresses and symbols, etc, I need to understand what to do with them. I feel like I am missing some steps.
I feel like I don't have a good grasp on how the symbol table entries interact with the relocations. How should I use the symbol's binding, visibility, value, and size information?
Lastly, when I output my file with the resolved data and new relocation entries used by the executable, the data is all incorrect. I'm not sure how to follow all the relocations and provide all the information necessary. What is the executable expecting from me?
My approach so far
I am trying to create a relocation file in a specific [undocumented] proprietary format that is heavily based on ELF. I have written a tool that takes an ELF file and a partially linked file (PLF) and processes them to output the fully resolved rel file. This rel file is used to load/unload data as needed in order to save memory. The platform is a 32bit PPC. One wrinkle is that tool is written for Windows in c#, but the data is intended for the PPC, so there are fun endian issues and the like to watch out for.
I've been trying to understand how relocations are handled when used to resolve unresolved symbols and so on. What I've done so far is to copy the relevant sections from the PLF and then for each corresponding .rela section, I parse the entries and attempt to fix up the section data and generate new relocation entries as needed. But this is where my difficulty is. I'm way out of my element here and this sort of thing seems to typically be done by linkers and loaders so there's not a lot of good examples to draw upon. But I have found a few that have been of help, including THIS ONE.
So what is happening is:
Copy section data from PLF to be used for rel file. I'm only interested in the .init (no data), .text, .ctors, .dtors, .rodata, .data, .bss (no data), and another custom section we are using.
Iterate over the .rela sections in the PLF and read in the Elf32_Rela entries.
For each entry, I pull out the r_offset, r_info, and r_addend fields and extract the relevant info from r_info (the symbol and the reloc type).
From the PLF's symbol table, I can get the symbolOffset, the symbolSection, and the symbolValue.
From the ELF, I get the symbolSection's load address.
I compute int localAddress = ( .relaSection.Offset + r_offset ).
I get the uint relocValue from the symbolSection's contents at r_offset.
Now I have all the info I need so I do a switch on the reloc type and process the data. These are the types I support:
R_PPC_NONE
R_PPC_ADDR32
R_PPC_ADDR24
R_PPC_ADDR16
R_PPC_ADDR16_LO
R_PPC_ADDR16_HI
R_PPC_ADDR16_HA
R_PPC_ADDR14
R_PPC_ADDR14_BRTAKEN
R_PPC_ADDR14_BRNTAKEN
R_PPC_REL24
R_PPC_REL14
R_PPC_REL14_BRTAKEN
R_PPC_REL14_BRNTAKEN
Now what?? I need to update the section data and build companion relocation entries. But I don't understand what is necessary to do and how to do it.
The whole reason I'm doing this is because there is an old obsolete unsupported tool that does not support using custom sections, which is a key requirement for this project (for memory reasons). We have a custom section that contains a bunch of initialization code (totaling about a meg) that we want to unload after start up. The existing tool just ignores all the data in that section.
So while making our own tool that does support custom sections is ideal, if there are any bright ideas for another way to achieve this goal, I'm all ears! We've floated around an idea of using the .dtor section for our data since it's nearly empty anyway. But this is messy and might not work anyway if it prevents a clean shutdown.
Relocations plus example code
When I process the relocations, I'm working off of the equations and information found in the ABI docs HERE (around section 4.13, page 80ish) as well as a number of other code examples and blog posts I've dug up. But it's all so confusing and not really spelled out and all the code I've found does things a little differently.
For example,
R_PPC_ADDR16_LO --> half16: #lo(S + A)
R_PPC_ADDR14_BRTAKEN --> low14*: (S + A) >> 2
etc
So when I see this kind of code, how do I decipher it?
Here's one example (from this source)
case ELF::R_PPC64_ADDR14 : {
assert(((Value + Addend) & 3) == 0);
// Preserve the AA/LK bits in the branch instruction
uint8_t aalk = *(LocalAddress+3);
writeInt16BE(LocalAddress + 2, (aalk & 3) | ((Value + Addend) & 0xfffc));
} break;
case ELF::R_PPC64_REL24 : {
uint64_t FinalAddress = (Section.LoadAddress + Offset);
int32_t delta = static_cast<int32_t>(Value - FinalAddress + Addend);
if (SignExtend32<24>(delta) != delta)
llvm_unreachable("Relocation R_PPC64_REL24 overflow");
// Generates a 'bl <address>' instruction
writeInt32BE(LocalAddress, 0x48000001 | (delta & 0x03FFFFFC));
} break;
Here's some from another example (here)
case R_PPC_ADDR32: /* word32 S + A */
addr = elf_lookup(lf, symidx, 1);
if (addr == 0)
return -1;
addr += addend;
*where = addr;
break;
case R_PPC_ADDR16_LO: /* #lo(S) */
if (addend != 0) {
addr = relocbase + addend;
} else {
addr = elf_lookup(lf, symidx, 1);
if (addr == 0)
return -1;
}
*hwhere = addr & 0xffff;
break;
case R_PPC_ADDR16_HA: /* #ha(S) */
if (addend != 0) {
addr = relocbase + addend;
} else {
addr = elf_lookup(lf, symidx, 1);
if (addr == 0)
return -1;
}
*hwhere = ((addr >> 16) + ((addr & 0x8000) ? 1 : 0)) & 0xffff;
break;
And one other example (from here)
case R_PPC_ADDR16_HA:
write_be16 (dso, rela->r_offset, (value + 0x8000) >> 16);
break;
case R_PPC_ADDR24:
write_be32 (dso, rela->r_offset, (value & 0x03fffffc) | (read_ube32 (dso, rela->r_offset) & 0xfc000003));
break;
case R_PPC_ADDR14:
write_be32 (dso, rela->r_offset, (value & 0xfffc) | (read_ube32 (dso, rela->r_offset) & 0xffff0003));
break;
case R_PPC_ADDR14_BRTAKEN:
case R_PPC_ADDR14_BRNTAKEN:
write_be32 (dso, rela->r_offset, (value & 0xfffc)
| (read_ube32 (dso, rela->r_offset) & 0xffdf0003)
| ((((GELF_R_TYPE (rela->r_info) == R_PPC_ADDR14_BRTAKEN) << 21)
^ (value >> 10)) & 0x00200000));
break;
case R_PPC_REL24:
write_be32 (dso, rela->r_offset, ((value - rela->r_offset) & 0x03fffffc) | (read_ube32 (dso, rela->r_offset) & 0xfc000003));
break;
case R_PPC_REL32:
write_be32 (dso, rela->r_offset, value - rela->r_offset);
break;
I really want to understand the magic these guys are doing here and why their code doesn't always look the same. I think some of the code is making assumptions that the data was already properly masked (for branches, etc), and some of the code is not. But I don't understand any of this at all.
Following the symbols/data/relocations, etc
When I look at the data in a hexeditor, I see a bunch of "48 00 00 01" all over. I've figured out that this is an opcode and needs to be updated with relocation information (this specifically is for 'bl' branch and link), yet my tool doesn't operate on the vast majority of them and the ones that I do update have the wrong values in them (compared to an example made by an obsolete tool). Clearly I am missing some part of the process.
In addition to the section data, there are additional relocation entries that need to be added to the end of the rel file. These consist of internal and external relocations but I haven't figured out much at all about these yet. (what's the difference between the two and when do you use one or the other?)
If you look near the end of this file at the function RuntimeDyldELF::processRelocationRef, you'll see some Relocation Entries being created. They also make stub functions. I suspect this is the missing link for me, but it's as clear as mud and I'm not following it even a little bit.
When I output the symbols in each relocation entry, they each have a binding/visibility [Global/Weak/Local] [Function/Object] and a value, a size, and a section. I know the section is where the symbol is located, and the value is the offset to the symbol in that section (or is it the virtual address?). The size is the size of the symbol, but is this important? Maybe the global/weak/local is useful for determining if it's an internal or external relocation?
Maybe this relocation table I'm talking about creating is actually a symbol table for my rel file? Maybe this table updates the symbol value from being a virtual address to being a section offset (since that's what the value is in relocatable files and the symbol table in the PLF is basically in an executable)?
Some resources:
blog on relocations: http://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries/
mentions opcodes at the end: http://wiki.netbsd.org/examples/elf_executables_for_powerpc/
my related unanswered question: ELF Relocation reverse engineering
Whew! That's a beast of a question. Congrats if you made it this far. :) Thanks in advance for any help you can give me.
I stumbled across this question and thought it deserved an answer.
Have elf.h handy. You can find it on the internet.
Each RELA section contains an array of Elf32_Rela entries as you know, but is also tied to a certain other section. r_offset is an offset into that other section (in this case - it works differently for shared libraries). You will find that section headers have a member called sh_info. This tells you which section that is. (It's an index into the section header table as you would expect.)
The 'symbol', which you got from r_info is in fact an index into a symbol table residing in another section. Look for the member sh_link in your RELA section's header.
The symbol table tells you the name of the symbol you are looking for, in the form of the st_name member of Elf32_Sym. st_name is an offset into a string section. Which section that is, you get from the sh_link member of your symbol table's section header. Sorry if this gets confusing.
Elf32_Shdr *sh_table = elf_image + ((Elf32_Ehdr *)elf_image)->e_shoff;
Elf32_Rela *relocs = elf_image + sh_table[relocation_section_index]->sh_offset;
unsigned section_to_modify_index = sh_table[relocation_section_index].sh_info;
char *to_modify = elf_image + sh_table[section_to_modify_index].sh_offset;
unsigned symbol_table_index = sh_table[relocation_section_index].sh_link;
Elf32_Sym *symbol_table = elf_image + sh_table[symbol_table_index].sh_offset;
unsigned string_table_index = sh_table[symbol_table].sh_link;
char *string_table = elf_image + sh_table[string_table_index].sh_offset;
Let's say we are working with relocation number i.
Elf32_Rela *rel = &relocs[i];
Elf32_Sym *sym = &symbol_table[ELF32_R_SYM(rel->r_info)];
char *symbol_name = string_table + sym->st_name;
Find the address of that symbol (let's say that symbol_name == "printf"). The final value will go in (to_modify + rel->r_offset).
As for the table on pages 79-83 of the pdf you linked, it tells us what to put at that address, and how many bytes to write. Obviously the address we just got (of printf in this case) is part of most of them. It corresponds to the S in the expressions.
r_addend is just A. Sometimes the compiler needs to add a static constant to a reloc i guess.
B is the base address of the shared object, or 0 for executable programs because they are not moved.
So if ELF32_R_TYPE(rel->r_info) == R_PPC_ADDR32 we have S + A, and the word size is word32 so we would get:
*(uint32_t *)(to_modify + rel->r_offset) = address_of_printf + rel->r_addend;
...and we have successfully performed a relocation.
I can't help you when it comes to the #lo, #hi, etc and the word sizes like low14. I know nothing about PPC but the linked pdf seems reasonable enough.
I also don't know about the stub functions. You don't normally need to know about those when linking (dynamically at least).
I'm not sure if I've answered all of your questions but you should be able to see what your example code does now at least.
Just answering the question: What is relocation?
When you write assembly program, if it is position-dependent, the program will be assumed to be loaded in a specific part of memory. And when you load your program to another memory address, then all the memory relative addressing data will have to be updated so that it will load correctly from the new memory area.
For example:
load A, B
Sometimes B can be an address, sometimes it is just pure constant data. So only compiler will know when they generate the assembly. Compiler will construct a table of all entries containing the offset, and its original loading relative address, assuming a fixed global address where it is laded. This is called the relocation table.
It is linked to the symbol table, PLT table, and GOT etc:
http://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-vi_18.html
https://reverseengineering.stackexchange.com/questions/1992/what-is-plt-got
What does the concept of relocation mean?
How does C++ linking work in practice?
Try to llok into ELF Specification. It takes about 60 pages, and greatly clarifies things. Especially part 2, the one about linking.

Organizing Notebooks & Saving Results in Mathematica

As of now I use 3 Notebook :
Functions
Where I have all the functions I created and call in the other Notebooks.
Transformation
Based on the original data, I compute transformations and add columns/List
When data is my raw data, I then call :
t1data : the result of the first transformation
t2data : the result of the second transformation
and so on,
I am yet at t20.
Display & Analysis
Using both the above I create Manipulate object that enable me to analyze the data.
Questions
Is there away to save the results of the Transformation Notebook such that t13data for example can be used in the Display & Analysis Notebooks without running all the previous computations (t1,t2,t3...t12) it is based on ?
Is there a way to use my Functions or transformed data without opening the corresponding Notebook ?
Does my separation strategy make sense at all ?
As of now I systematically open the 3 and have to run them all before being able to do anything, and it takes a while given my poor computing power and yet inefficient codes.
Saving variable states: can be done using DumpSave, Save or Put. Read back using Get or <<
You could make a package from your functions and read those back using Needs or <<
It's not something I usually do. I opt for a monolithic notebook containing everything (nicely layered with sections and subsections so that you can fold open or close) or for a package + slightly leaner analysis notebook depending on the weather and some other hidden variables.
Saving intermediate results
The native file format for Mathematica expressions is the .m file. This is human readable text format, and you can view the file in a text editor if you ever doubt what is, or is not being saved. You can load these files using Get. The shorthand form for Get is:
<< "filename.m"
Using Get will replace or refresh any existing assignments that are explicitly made in the .m file.
Saving intermediate results that are simple assignments (dat = ...) may be done with Put. The shorthand form for Put is:
dat >> "dat.m"
This saves only the assigned expression itself; to restore the definition you must use:
dat = << "dat.m"
See also PutAppend for appending data to a .m file as new results are created.
Saving results and function definitions that are complex assignments is done with Save. Examples of such assignments include:
f[x_] := subfunc[x, 2]
g[1] = "cat"
g[2] = "dog"
nCr = #!/(#2! (# - #2)!) &;
nPr = nCr[##] #2! &;
For the last example, the complexity is that nPr depends on nCr. Using Save it is sufficient to save only nPr to get a fully working definition of nPr: the definition of nCr will automatically be saved as well. The syntax is:
Save["nPr.m", nPr]
Using Save the assignments themselves are saved; to restore the definitions use:
<< "nPr.m" ;
Moving functions to a Package
In addition to Put and Save, or manual creation in a text editor, .m files may be generated automatically. This is done by creating a Notebook and setting Cell > Cell Properties > Initialization Cell on the cells that contain your function definitions. When you save the Notebook for the first time, Mathematica will ask if you want to create an Auto Save Package. Do so, and Mathematica will generate a .m file in parallel to the .nb file, containing the contents of all Initialization Cells in the Notebook. Further, it will update this .m file every time you save the Notebook, so you never need to manually update it.
Sine all Initialization Cells will be saved to the parallel .m file, I recommend using the Notebook only for the generation of this Package, and not also for the rest of your computations.
When managing functions, one must consider context. Not all functions should be global at all times. A series of related functions should often be kept in its own context which can then be easily exposed to or removed from $ContextPath. Further, a series of functions often rely on subfunctions that do not need to be called outside of the primary functions, therefore these subfunctions should not be global. All of this relates to Package creation. Incidentally, it also relates to the formatting of code, because knowing that not all subfunctions must be exposed as global gives one the freedom to move many subfunctions to the "top level" of the code, that is, outside of Module or other scoping constructs, without conflicting with global symbols.
Package creation is a complex topic. You should familiarize yourself with Begin, BeginPackage, End and EndPackage to better understand it, but here is a simple framework to get you started. You can follow it as a template for the time being.
This is an old definition I used before DeleteDuplicates existed:
BeginPackage["UU`"]
UnsortedUnion::usage = "UnsortedUnion works like Union, but doesn't \
return a sorted list. \nThis function is considerably slower than \
Union though."
Begin["`Private`"]
UnsortedUnion =
Module[{f}, f[y_] := (f[y] = Sequence[]; y); f /# Join###] &
End[]
EndPackage[]
Everything above goes in Initialization Cells. You can insert Text cells, Sections, or even other input cells without harming the generated Package: only the contents of the Initialization Cells will be exported.
BeginPackage defines the Context that your functions will belong to, and disables all non-System` definitions, preventing collisions. (There are ways to call other functions from your package, but that is better for another question).
By convention, a ::usage message is defined for each function that it to be accessible outside the package itself. This is not superfluous! While there are other methods, without this, you will not expose your function in the visible Context.
Next, you Begin a context that is for the package alone, conventionally "`Private`". After this point any symbols you define (that are not used outside of this Begin/End block) will not be exposed globally after the Package is loaded, and will therefore not collide with Global` symbols.
After your function definition(s), you close the block with End[]. You may use as many Begin/End blocks as you like, and I typically use a separate one for each function, though it is not required.
Finally, close with EndPackage[] to restore the environment to what it was before using BeginPackage.
After you save the Notebook and generate the .m package (let's say "mypackage.m"), you can load it with Get:
<< "mypackage.m"
Now, there will be a function UnsortedUnion in the Context UU` and it will be accessible globally.
You should also look into the functionality of Needs, but that is a little more advanced in my opinion, so I shall stop here.