At what point does CUPS actually call the rasterizer in the printing process? - cups

I am trying to tap into the CUPS raster and obtain some lower level info such as pixel data, color mode, bits per pixel, bits per color, and anything else really. I can't figure out how CUPS uses the raster. Whenever I print something to PDF it never goes through any of the functions in the filter/raster.c file.
Is my approach/reasoning incorrect? I've tried printing images (png), text and PDF and the result is the same.

CUPS does not have any component called 'rasterizer'.
When CUPS needs to process a submitted file (you can print on the command line, like 'lp -d printername the.file', did you know?), ...
...the first thing it does is auto-type the incoming file in order to determine its MIME type;
...next, it checks which target print queue the user requested ('printername' in the above command); each target printer requires its own file format, which is also a MIME type of its own (one that is of course different for a PCL, a PostScript, an ESC/P, a GDI, a proprietary "whatever", or even a PDF-consuming printer);
...based on input and required final output file types of the current job, CUPS constructs an appropriate filtering chain and runs the input data through these filters.
You can follow the course of these conversions by enabling LogLevel debug in /etc/cups/cupsd.conf (restart the CUPS daemon after modifying this). Then check the log file:
less /var/log/cups/error_log
This will now show lines containing 'Started filter /usr/lib/cups/filter/...' indicating the time when each filter in the chain is started.
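For example, to pull just those lines out of the log after reproducing a print job:
grep 'Started filter' /var/log/cups/error_log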
The raster/raster.c source code file contains code which is used if the filtering chain contains any of the ABCDtoraster or rastertoXYZ filters. These filters may or may not be present on your system in the directory /usr/lib/cups/filter/, and they create or post-process a CUPS-specific raster format defined here: https://www.cups.org/doc/spec-raster.html
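If you want to see which filters CUPS would pick for a given input file without digging through the logs, the cupsfilter(8) utility can list them. A sketch (note the --list-filters option may not exist in very old CUPS versions, and using application/vnd.cups-raster as the target type to force a raster chain is my assumption):
cupsfilter --list-filters -m application/vnd.cups-raster the.file
This prints the names of the filters that would run, without actually executing them.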

Changing output dataset path in the transform function

Can we change the output dataset path dynamically in my_compute_function, as shown below?
from transforms.api import transform, Input, Output

@transform(
    my_output=Output("/path/to/my/dataset"),
    my_input=Input("/path/to/input"),
)
def my_compute_function(my_output, my_input):
    my_output.path = "new path"  # <-- the line in question
    my_output.write_dataframe(
        my_input.dataframe()
    )
No, this is not possible. The reason is that the inputs/outputs/transforms are fixed at "CI-time" or "build-time". When you press "commit" in Authoring or you merge a PR, a CI job is kicked off.
In this CI job, all the relations between inputs and outputs are determined. Output datasets that don't exist yet are created, and a "jobspec" is added to them. A "jobspec" is a snippet of JSON that describes to Foundry how a particular dataset is generated.
Anytime you press the "build" button on a dataset (or build the dataset through a schedule or similar), the jobspec is consulted. It contains a reference to the repository, revision, source file and entry point of the function that builds this dataset. From there the build is orchestrated and kicks off, invoking your function to produce the final output.
This mechanism allows you to get a "static view" of the entire pipeline, which you can then visualize with Monocle, as you might have seen.
Depending on what your needs are, here are some solutions you might be able to use instead:
Tag the rows you're producing in your transform in some way, so that even though you put them into a single dataset, you can later select them by this tag/category (see the sketch at the end of this answer)
If your set of categories does not change often, you can instead create the output datasets ahead of time and then filter the rows into the appropriate dataset they should go into.
The main drawback of the latter approach is that it's not very dynamic: if a new category shows up, you'll have to manually change the code to "triage" it into a new dataset, and until you do, that data won't be available.
There are other solutions (for instance, it is ultimately possible to make API calls and manually adjust inputs/outputs), but they are more complex and undesirable from a maintenance perspective.
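A minimal sketch of the row-tagging approach (assumptions on my part: a transform_df transform, PySpark, and a made-up value column from which the category is derived):

from pyspark.sql import functions as F
from transforms.api import transform_df, Input, Output

@transform_df(
    Output("/path/to/my/dataset"),
    my_input=Input("/path/to/input"),
)
def my_compute_function(my_input):
    # Instead of routing rows to different output datasets, tag every row
    # with a category column; downstream consumers filter on that column.
    return my_input.withColumn(
        "category",
        F.when(F.col("value") > 100, "large").otherwise("small"),
    )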

How to interpret tcl command in openOCD manual

I'm completely new to tcl and am trying to understand how to script the command "adapter usb location" in openOCD.
From the openOCD manual, the command has this description: [documentation snippet not reproduced here]
I want to point it to the port with the red arrow below: [screenshot not reproduced here]
Thanks.
It's not 100% clear, but I would expect (from that snippet of documentation) a bus location to be a dotted “path” something like:
1-6
where the values are:
1 — Bus ID
6 — Port ID
Which would result in a call to the command being done like this:
adapter usb location 1-6
When there's a more complex structure involved (internally because of chained hubs) such as with the item above the one you pointed at, I'd instead expect:
1-5.3
Notice that there is a sequence of port IDs (5.3) in there to represent that structure. The resulting call would then be:
adapter usb location 1-5.3
Now for the caveats!
I can't tell what the actual format of those IDs is. They might just be numbers, or they might have some textual prefix (e.g., bus1-port6). Those text prefixes, if present, might contain a space (or other metacharacter), which will be deeply annoying to work with if true. You should be able to run adapter usb location without any other arguments to see what the current location is; be aware, though, that it might return the empty string (or give an error) if there is no current location. I welcome feedback on this, as that information appears not to be present in any online documentation I can find (and I don't have things installed, so I can't just check).
I also have no idea what (if anything) to do with the device and interface IDs.
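As an aside (my assumption, not something from the OpenOCD documentation): on a Linux host the kernel already names USB devices in exactly this bus-port notation, so you can discover candidate values by inspecting the USB topology:
lsusb -t
ls /sys/bus/usb/devices
Entries such as 1-6 or 1-5.3 under /sys/bus/usb/devices correspond directly to the bus-port strings discussed above.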

Working on migration of SPL 3.0 to 4.2 (TEDA)

I am working on migrating 3.0 code to the new 4.2 framework and am facing a few difficulties:
1. How do I do CDR-level deduplication in the new 4.2 framework? (Note: table deduplication is already done.)
2. Where should I implement the PostDedupProcessor: context or chainsink custom? In either case, do I need to remove duplicate hashcodes from the list or just reject the tuples? Here I am also updating columns for a few tuples.
3. My file is not moving into the archive. A temporary output file is generated, but it is empty and lands outside the load directory. What could be the possible reasons? I have thoroughly checked the config parameters, and after adding logs it seems the correct output is being sent from the transformer custom, so I don't know where it gets stuck. I printed the TableRowGenerator stream to the logs (at the end of the DataProcessor).
1. and 2.:
You need to select the type of deduplication; there is not a big difference between choosing table-level and CDR-level deduplication.
The ite.businessLogic.transformation.outputType parameter affects this. There is only one dedup stage; you cannot have both.
Select recordStream for CDR-level deduplication, and do the transformation to table-row format (e.g. if you want to use the TableFileWriter) in xxx.chainsink.custom::PostContextDataProcessor.
In xxx.chainsink.custom::PostContextDataProcessor you need to add custom code for duplicate handling: reject (discard) the tuples, set special column values, or write them to different target tables.
3.:
Possible reasons include:
Missing forwarding of window punctuations or of the statistic tuple
An error in the BloomFilter configuration; you would notice this easily because the PE goes down and the error log hints at wrong sha2 functions being used
To troubleshoot your ITE application, I recommend enabling the following debug sinks if checking the StreamsStudio live graph is not sufficient:
ite.businessLogic.transformation.debug=on
ite.businessLogic.group.debug=on
ite.businessLogic.sink.debug=on
Run a test with a single input file only and check the flow of your record and statistic tuples. The debug sinks also write punctuation markers to the debug files.

Why is SSIS complaining that "There is a partial row at the end of the file"?

I'm importing a flat file into a database using a Data Flow Task in SSIS. The file is very simple: it contains three comma-separated values per row. Whenever I run this task, however, I receive a warning from the Flat File component:
Warning: 0x8020200F: There is a partial row at the end of the file.
This warning seems to happen regardless of the size of the file: even with only a handful of rows in the file, visually validated (with extended characters and whatnot visible), I still receive it. Moreover, it doesn't seem to matter whether I have a blank row at the end of the file or end it without a trailing CR+LF.
How can I get rid of this warning so I can run my package with WarnAsError enabled?
(BTW, it seems someone else may have had a similar problem in There is a partial row at the end of the file, though it wasn't much of a question.)
I have found three things to try if you encounter this problem. In at least two out of the three cases, SSIS was ignoring rows of my input file with only the above warning to show for it. Because of that, I do not recommend ignoring this warning!
Step 1: verify that your flat file is valid
This error will appear when you have an invalid input file. This can be especially hard to detect if your input file has millions of lines, as mine do, but it's vital that you discover file format violations because SSIS will happily give you this warning and continue on its way without importing the offending lines or, in some cases, the lines after the offending lines. The easiest way I found to discover a problem with the source file is to check the number of rows that are being imported successfully. If it's vastly different than the number you expect in your flat file, something may have gone wrong in the middle somewhere.
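If you don't have a better tool handy, even a tiny script gives you a reference row count to compare against what SSIS reports (a sketch; input.csv is a placeholder name):

with open("input.csv", "rb") as f:
    # Count physical lines; compare against the number of rows SSIS imported.
    print(sum(1 for _ in f))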
Step 2: try a dummy line at the end (fixed-width only)
If you are using a fixed-width format input file, Microsoft may have a helpful KB article for you. Basically, they suggest that you add a dummy line at the end of the file.
I am not using fixed-width files, so I can't say how useful this technique is.
Step 3: turn off text qualification for non-text
This is the tricky one, because I believe the TextQualified property is True by default. If your input file uses non-text fields (integers, etc.), then you must tell SSIS that it should not expect those columns to be qualified as text. Essentially, your input file will be invalid in spite of looking perfectly valid.
TextQualified is a property of the columns in your Flat File Connection Manager.
To change it, open up your connection manager, click "Advanced", and then click on a non-text column. Make sure the TextQualified property is set to False. You will need to do this for all of your non-text columns.
If the byte width of a line in the file is known, you can always double-check that the total byte size of the file divides evenly by the expected line size, giving you a whole line count (as opposed to a fractional one).
It also helps to know from your source how many records are expected, but if you don't have that, you can at least double-check the loaded table's record count against the line count calculated while loading the file.
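A quick way to run that arithmetic (a sketch; the 103-byte row width, including the trailing CR+LF, and the file name are made-up values):

import os

ROW_BYTES = 103  # expected bytes per row, including the trailing CR+LF
size = os.path.getsize("input.dat")
rows, remainder = divmod(size, ROW_BYTES)
print(f"{rows} full rows; {remainder} leftover bytes")
# A non-zero remainder means the file really does end in a partial row.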
I've seen this error often when a source flat text file is missing its last \r\n at the end of the file.
Running on 64-bit Windows was perfect: no missing rows. But I lost the last row when running on Windows 2008.
My workaround was:
1. Open the SSIS package in BIDS on the Windows 2008 machine.
2. Open the file connection manager and make sure the Text Qualifier is set to <none>.
3. Rebuild it.
Everything then worked fine on both Windows 7 and Windows 2008.

Organizing Notebooks & Saving Results in Mathematica

As of now I use 3 Notebooks:
Functions
Where I have all the functions I created and call in the other Notebooks.
Transformation
Based on the original data, I compute transformations and add columns/Lists.
Where data is my raw data, I then define:
t1data: the result of the first transformation
t2data: the result of the second transformation
and so on;
I am already at t20.
Display & Analysis
Using both of the above, I create Manipulate objects that enable me to analyze the data.
Questions
Is there a way to save the results of the Transformation Notebook so that t13data, for example, can be used in the Display & Analysis Notebook without re-running all the previous computations (t1, t2, t3... t12) it is based on?
Is there a way to use my Functions or transformed data without opening the corresponding Notebook?
Does my separation strategy make sense at all?
As of now I systematically open all 3 and have to run them all before being able to do anything, and it takes a while given my poor computing power and my still-inefficient code.
Saving variable states: this can be done using DumpSave, Save or Put. Read them back using Get or <<.
You could make a package from your functions and read those back using Needs or <<
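For example (a sketch; t13data stands in for whichever intermediate result you want to persist):

(* In the Transformation notebook, after computing t13data: *)
DumpSave["t13data.mx", t13data]   (* fast binary .mx format *)

(* In the Display & Analysis notebook: *)
Get["t13data.mx"]                 (* restores the definition of t13data *)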
It's not something I usually do. I opt for a monolithic notebook containing everything (nicely layered with sections and subsections so that you can fold open or close) or for a package + slightly leaner analysis notebook depending on the weather and some other hidden variables.
Saving intermediate results
The native file format for Mathematica expressions is the .m file. This is a human-readable text format, and you can view the file in a text editor if you ever doubt what is, or is not, being saved. You can load these files using Get. The shorthand form for Get is:
<< "filename.m"
Using Get will replace or refresh any existing assignments that are explicitly made in the .m file.
Saving intermediate results that are simple assignments (dat = ...) may be done with Put. The shorthand form for Put is:
dat >> "dat.m"
This saves only the assigned expression itself; to restore the definition you must use:
dat = << "dat.m"
See also PutAppend for appending data to a .m file as new results are created.
Saving results and function definitions that are complex assignments is done with Save. Examples of such assignments include:
f[x_] := subfunc[x, 2]
g[1] = "cat"
g[2] = "dog"
nCr = #!/(#2! (# - #2)!) &;
nPr = nCr[##] #2! &;
For the last example, the complexity is that nPr depends on nCr. Using Save it is sufficient to save only nPr to get a fully working definition of nPr: the definition of nCr will automatically be saved as well. The syntax is:
Save["nPr.m", nPr]
Using Save the assignments themselves are saved; to restore the definitions use:
<< "nPr.m" ;
Moving functions to a Package
In addition to Put and Save, or manual creation in a text editor, .m files may be generated automatically. This is done by creating a Notebook and setting Cell > Cell Properties > Initialization Cell on the cells that contain your function definitions. When you save the Notebook for the first time, Mathematica will ask if you want to create an Auto Save Package. Do so, and Mathematica will generate a .m file in parallel to the .nb file, containing the contents of all Initialization Cells in the Notebook. Further, it will update this .m file every time you save the Notebook, so you never need to manually update it.
Since all Initialization Cells will be saved to the parallel .m file, I recommend using the Notebook only for the generation of this Package, and not also for the rest of your computations.
When managing functions, one must consider context. Not all functions should be global at all times. A series of related functions should often be kept in its own context which can then be easily exposed to or removed from $ContextPath. Further, a series of functions often rely on subfunctions that do not need to be called outside of the primary functions, therefore these subfunctions should not be global. All of this relates to Package creation. Incidentally, it also relates to the formatting of code, because knowing that not all subfunctions must be exposed as global gives one the freedom to move many subfunctions to the "top level" of the code, that is, outside of Module or other scoping constructs, without conflicting with global symbols.
Package creation is a complex topic. You should familiarize yourself with Begin, BeginPackage, End and EndPackage to better understand it, but here is a simple framework to get you started. You can follow it as a template for the time being.
This is an old definition I used before DeleteDuplicates existed:
BeginPackage["UU`"]
UnsortedUnion::usage = "UnsortedUnion works like Union, but doesn't \
return a sorted list. \nThis function is considerably slower than \
Union though."
Begin["`Private`"]
UnsortedUnion =
 Module[{f}, f[y_] := (f[y] = Sequence[]; y); f /@ Join@##] &
End[]
EndPackage[]
Everything above goes in Initialization Cells. You can insert Text cells, Sections, or even other input cells without harming the generated Package: only the contents of the Initialization Cells will be exported.
BeginPackage defines the Context that your functions will belong to, and disables all non-System` definitions, preventing collisions. (There are ways to call other functions from your package, but that is better for another question).
By convention, a ::usage message is defined for each function that is to be accessible outside the package itself. This is not superfluous! While there are other methods, without this you will not expose your function in the visible Context.
Next, you Begin a context that is for the package alone, conventionally "`Private`". After this point any symbols you define (that are not used outside of this Begin/End block) will not be exposed globally after the Package is loaded, and will therefore not collide with Global` symbols.
After your function definition(s), you close the block with End[]. You may use as many Begin/End blocks as you like, and I typically use a separate one for each function, though it is not required.
Finally, close with EndPackage[] to restore the environment to what it was before using BeginPackage.
After you save the Notebook and generate the .m package (let's say "mypackage.m"), you can load it with Get:
<< "mypackage.m"
Now, there will be a function UnsortedUnion in the Context UU` and it will be accessible globally.
You should also look into the functionality of Needs, but that is a little more advanced in my opinion, so I shall stop here.