I have images stored in MySQL as blobs (I know it's wrong), and there are many of them. Is there any fast way to dump them all to disk, like SELECT ... INTO OUTFILE, but to many files instead of one? Or is the only way to write a script that iterates over the rows and saves the images?
Since you want them saved to separate files on disk, you'll have to go with a script.
#!/usr/bin/perl
# Note: it is my habit to name a Query Result $qR.
use strict;
use warnings;
use DBI;
my $dbh = DBI->connect(YOUR_INFO_HERE);
my $i = 0;
my $q = $dbh->prepare('SELECT image FROM images');
$q->execute();
while (my $qR = $q->fetchrow_arrayref) {
    open(my $fh, '>', "$i.jpg") or die "Cannot open $i.jpg: $!";
    binmode($fh);            # images are binary data
    print $fh $qR->[0];
    close($fh);
    $i++;
}
I had a similar requirement. In my case, using Java + Hibernate got me there quite quickly (a similar approach probably works with other Hibernate variations, but I haven't tried it).
I set up a mapping like this:
<hibernate-mapping>
  <class name="<com.package.table>" table="table">
    <id column="pk" name="pk" type="int">
    </id>
    <property name="blobfield" type="blob"/>
  </class>
</hibernate-mapping>
A Java bean to carry the data, something like:
package com.package;
import java.sql.Blob;
...
public class table {
    ...
    public Blob getBlobfield() {
    ...
And a loop something like this:
...
tx = session.beginTransaction();
Criteria crit = session.createCriteria(table.class);
crit.setMaxResults(50); // Alter this to suit...
List<table> rows = crit.list();
for (table r : rows) {
    ExtractBlob(r.getId(), r.getBlobfield());
}
And something ("ExtractBlob" is I'm calling this) to extract the blob (using the PK to generate a filename), something like this:
...
FileOutputStream fout = new FileOutputStream(<...base output file on PK for example...>);
BufferedOutputStream bos = new BufferedOutputStream(fout);
InputStream is = blob.getBinaryStream();
byte[] b = new byte[8192];
int len;
while ((len = is.read(b)) > 0) {
    bos.write(b, 0, len); // only write the bytes actually read
}
is.close();
bos.close();
...
I can post a more complete example if it looks like it might be useful, but I would have to extract the code from a bigger project; otherwise I would have just posted it straight up.
I am trying to move data from a SPARQL endpoint to a JSONObject, using RDF4J.
The RDF4J documentation does not address this directly (there is some info about using endpoints, less about converting to JSON, and nothing where these two cases meet up).
So far I have:
SPARQLRepository repo = new SPARQLRepository(<My Endpoint>);
Map<String, String> headers = new HashMap<String, String>();
headers.put("Accept", "SPARQL/JSON");
repo.setAdditionalHttpHeaders(headers);
try (RepositoryConnection conn = repo.getConnection())
{
String queryString = "SELECT * WHERE {GRAPH <urn:x-evn-master:mwadata> {?s ?p ?o}}";
GraphQuery query = conn.prepareGraphQuery(queryString);
debug("Mark 2");
try (GraphQueryResult result = query.evaluate())
This fails because "Server responded with an unsupported file format: application/sparql-results+json".
I figured a SPARQLGraphQuery should take the place of GraphQuery, but RepositoryConnection does not have a relevant prepare method.
If I exchange
try (RepositoryConnection conn = repo.getConnection())
with
try (SPARQLConnection conn = (SPARQLConnection)repo.getConnection())
I run into the problem that SPARQLConnection does not generate a SPARQLGraphQuery. The closest I can get is:
SPARQLGraphQuery query = (SPARQLGraphQuery)conn.prepareQuery(QueryLanguage.SPARQL, queryString);
which gives a runtime error, as these types cannot be cast to each other.
I do not know how to proceed from here. Any help or advice is much appreciated. Thank you.
this fails because "Server responded with an unsupported file format: application/sparql-results+json"
In RDF4J, SPARQL SELECT queries are tuple queries, so named because each result is a set of bindings, which are tuples of the form (name, value). In contrast, CONSTRUCT (and DESCRIBE) queries are graph queries, so called because their result is a graph, that is, a collection of RDF statements.
Furthermore, setting additional headers for the response format, as you have done here, is not necessary (except in rare circumstances); the RDF4J client handles this for you automatically, based on the registered set of parsers.
So, in short, simplify your code as follows:
SPARQLRepository repo = new SPARQLRepository(<My Endpoint>);
try (RepositoryConnection conn = repo.getConnection()) {
String queryString = "SELECT * WHERE {GRAPH <urn:x-evn-master:mwadata> {?s ?p ?o}}";
TupleQuery query = conn.prepareTupleQuery(queryString);
debug("Mark 2");
try (TupleQueryResult result = query.evaluate()) {
...
}
}
If you want to write the result of the query in JSON format, you could use a TupleQueryResultHandler, for example the SPARQLResultsJSONWriter, as follows:
SPARQLRepository repo = new SPARQLRepository(<My Endpoint>);
try (RepositoryConnection conn = repo.getConnection()) {
String queryString = "SELECT * WHERE {GRAPH <urn:x-evn-master:mwadata> {?s ?p ?o}}";
TupleQuery query = conn.prepareTupleQuery(queryString);
query.evaluate(new SPARQLResultsJSONWriter(System.out));
}
This will write the result of the query (in this example to standard output) using the SPARQL Query Results JSON format. If you have a non-standard format in mind, you could of course also create your own TupleQueryResultHandler implementation.
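For instance, a bare-bones custom handler might look something like this sketch (the class name and what it does with each solution are purely illustrative, not part of the RDF4J API):

import java.util.List;
import org.eclipse.rdf4j.query.BindingSet;
import org.eclipse.rdf4j.query.TupleQueryResultHandler;

// Illustrative only: prints each solution as it streams in instead of buffering a full result set.
public class PrintingResultHandler implements TupleQueryResultHandler {

    @Override
    public void startQueryResult(List<String> bindingNames) {
        System.out.println("Columns: " + bindingNames);
    }

    @Override
    public void handleSolution(BindingSet bindingSet) {
        System.out.println(bindingSet); // one row (set of variable bindings) per call
    }

    @Override
    public void endQueryResult() {
        System.out.println("Done.");
    }

    @Override
    public void handleBoolean(boolean value) {
        // not used for SELECT results
    }

    @Override
    public void handleLinks(List<String> linkUrls) {
        // not used here
    }
}

You would then pass an instance of it to query.evaluate(...) in place of the SPARQLResultsJSONWriter.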
For more details on the various ways in which you can process the result (including iterating, streaming, adding to a List, or just directly sending to a result handler), see the documentation on querying a repository. As an aside, the javadoc on the RDF4J APIs is pretty extensive too, so if your Java editing environment has support for displaying that, I'd advise you to make use of it.
I am working on a BIML project to generate SSIS packages. I have a separate static class for utility methods.
I am attempting to call GetDropAndCreateDdl() to get the DDL from the source to dynamically create a table in the destination. This should work in theory, as it is referenced in multiple posts: here and here as samples.
When generating the BIML, running the sample code below, I receive an error: Error: 'AstTableNode' does not contain a definition for 'GetDropAndCreateDdl' and no accessible extension method 'GetDropAndCreateDdl' accepting a first argument of type 'AstTableNode' could be found
public static string GetDropAndCreateDDL(string connectionStringSource, string sourceTableName)
{
    var sourceConnection = SchemaManager.CreateConnectionNode("Source", connectionStringSource);
    var sourceImportResults = sourceConnection.ImportTableNodes(Nomenclature.Schema(sourceTableName), Nomenclature.Table(sourceTableName));
    return sourceImportResults.TableNodes.ToList()[0].GetDropAndCreateDdl();
}
(Let's ignore the possibility of getting no table back or multiples for the sake of simplicity)
Looking at the Varigence documentation, I don't see any reference to this method. This makes me think that there is a utility library that I am missing in my includes.
using Varigence.Biml.Extensions;
using Varigence.Biml.CoreLowerer.SchemaManagement;
What say you?
Joe
GetDropAndCreateDdl is an extension method in Varigence.Biml.Extensions.SchemaManagement.TableExtensions.
ImportTableNodes returns an instance of Varigence.Biml.CoreLowerer.SchemaManagement.ImportResults, and its TableNodes property is an IEnumerable of AstTableNode.
So, nothing weird there (like the table nodes in the import results being a different type).
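Based on that, if your utility class only has the two usings you listed, adding the namespace that holds TableExtensions should make the extension method resolve; a likely fix, assuming the namespace is exactly as described above:

using Varigence.Biml.Extensions;
using Varigence.Biml.CoreLowerer.SchemaManagement;
using Varigence.Biml.Extensions.SchemaManagement; // brings TableExtensions.GetDropAndCreateDdl into scope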
I am not running into an issue if I have the code in-line with BimlExpress.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#
string connectionStringSource = #"Provider=SQLNCLI11;Data Source=localhost\dev2017;Integrated Security=SSPI;Initial Catalog=msdb";
var sourceConnection = SchemaManager.CreateConnectionNode("Source", connectionStringSource);
List<string> schemaList = new List<string>(){"dbo"};
var sourceImportResults = sourceConnection.ImportTableNodes("dbo", "");
WriteLine("<!-- {0} -->", sourceImportResults.TableNodes.Count());
//var sourceImportResults = sourceConnection.ImportTableNodes(schemaList,null);
var x = sourceImportResults.TableNodes.ToList()[0];
var ddl = x.GetDropAndCreateDdl();
WriteLine("<!-- {0} -->", sourceImportResults.TableNodes.FirstOrDefault().GetDropAndCreateDdl());
#>
</Biml>
The above code results in the following expanded Biml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<!-- 221 -->
<!-- IF EXISTS (SELECT * from sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[autoadmin_backup_configuration_summary]') AND type IN (N'V'))
DROP VIEW [dbo].[autoadmin_backup_configuration_summary]
GO
CREATE VIEW [dbo].[autoadmin_backup_configuration_summary] AS
SELECT
ManagedBackupVersion,
IsAlwaysOn,
IsDropped,
IsEnabled,
RetentionPeriod,
EncryptionAlgorithm,
SchedulingOption,
DayOfWeek,
COUNT(*) AS DatabaseCount
FROM autoadmin_backup_configurations
GROUP BY
ManagedBackupVersion,
IsAlwaysOn,
IsDropped,
IsEnabled,
RetentionPeriod,
EncryptionAlgorithm,
SchedulingOption,
DayOfWeek
GO
-->
</Biml>
What do you use to get the checksum of a table in Laravel? Is there something already abstracted for this, or do you have to use raw commands?
You have to use raw commands, but it is pretty easy; just add this method to your model:
public static function checksum()
{
    $tableName = with(new static)->getTable();
    $query = sprintf('CHECKSUM TABLE %s', $tableName);
    return \DB::select(\DB::raw($query))[0]->Checksum;
}
You can now call this method statically to get the checksum.
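For example, assuming a hypothetical Post model that contains this method:

$checksum = Post::checksum(); // value of the Checksum column returned by CHECKSUM TABLE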
I want to import my IIS logs into SQL Server for reporting using BULK INSERT, but the comment lines - the ones that start with a # - cause a problem because those lines do not have the same number of fields as the data lines.
If I manually delete the comments, I can perform a bulk insert.
Is there a way to perform a bulk insert while excluding lines based on a match, such as any line that begins with a "#"?
Thanks.
The approach I generally use with BULK INSERT and irregular data is to push the incoming data into a temporary staging table with a single VARCHAR(MAX) column.
Once it's in there, I can use more flexible decision-making tools like SQL queries and string functions to decide which rows I want to select out of the staging table and bring into my main tables. This is also helpful because BULK INSERT can be maddeningly cryptic about why it fails on a specific file.
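A minimal sketch of that staging pattern, with made-up table and file names (IIS W3C logs are space-delimited, so the default tab field terminator leaves each whole line in the single column):

-- Stage each raw log line into one wide column.
CREATE TABLE #staging (raw_line VARCHAR(MAX));

BULK INSERT #staging
FROM 'C:\logs\u_ex140101.log'
WITH (ROWTERMINATOR = '\n');

-- Comment lines in IIS logs start with '#'; keep only the data rows,
-- then parse/split raw_line into the real destination table.
SELECT raw_line
FROM #staging
WHERE raw_line NOT LIKE '#%';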
The only other option I can think of is using pre-upload scripting to trim comments and other lines that don't fit your tabular criteria before you do your bulk insert.
I recommend using logparser.exe instead. LogParser has some pretty neat capabilities on its own, but it can also be used to format the IIS log to be properly imported by SQL Server.
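For example, a command along these lines (server, database, and log path are placeholders here) loads W3C-format IIS logs straight into a table, and LogParser skips the # directive lines on its own:

LogParser.exe "SELECT * INTO IisLog FROM C:\inetpub\logs\LogFiles\W3SVC1\*.log" -i:IISW3C -o:SQL -server:localhost -database:LogDB -createTable:ON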
Microsoft has a tool called "PrepWebLog" (http://support.microsoft.com/kb/296093) which strips out these hash/pound characters; however, I'm running it now (using a PowerShell script for multiple files) and am finding its performance intolerably slow.
I think it'd be faster if I wrote a C# program (or maybe even a macro).
Update: PrepWebLog just crashed on me. I'd avoid it.
Update #2: I looked at PowerShell's Get-Content and Set-Content cmdlets but didn't like the syntax and the likely performance, so I wrote this little C# console app:
if (args.Length == 2)
{
    string path = args[0];
    string outPath = args[1];

    // Matches whole comment lines (those starting with '#') including their CRLF terminator.
    Regex hashString = new Regex("^#.+\r\n", RegexOptions.Multiline | RegexOptions.Compiled);

    foreach (string file in Directory.GetFiles(path, "*.log"))
    {
        string data;
        using (StreamReader sr = new StreamReader(file))
        {
            data = sr.ReadToEnd();
        }

        string output = hashString.Replace(data, string.Empty);

        using (StreamWriter sw = new StreamWriter(Path.Combine(outPath, new FileInfo(file).Name), false))
        {
            sw.Write(output);
        }
    }
}
else
{
    Console.WriteLine("Source and Destination Log Path required or too many arguments");
}
It's pretty quick.
Following up on what PeterX wrote, I modified the application to handle large log files, since anything sufficiently large would create an out-of-memory exception. Also, since we're only interested in whether or not the first character of a line is a hash, we can just use the StartsWith() method on each line as it is read.
class Program
{
    static void Main(string[] args)
    {
        if (args.Length == 2)
        {
            string path = args[0];
            string outPath = args[1];
            string line;

            foreach (string file in Directory.GetFiles(path, "*.log"))
            {
                using (StreamReader sr = new StreamReader(file))
                {
                    using (StreamWriter sw = new StreamWriter(Path.Combine(outPath, new FileInfo(file).Name), false))
                    {
                        while ((line = sr.ReadLine()) != null)
                        {
                            if (!line.StartsWith("#"))
                            {
                                sw.WriteLine(line);
                            }
                        }
                    }
                }
            }
        }
        else
        {
            Console.WriteLine("Source and Destination Log Path required or too many arguments");
        }
    }
}
I've been having some issues using SchemaUpdate with MySQL.
I seem to have implemented everything correctly, but when I run it, it doesn't update anything. It doesn't generate any errors, and it pauses for about the length of time you would expect it to take to inspect the DB schema, but it simply doesn't update anything, and when I try to get it to script the change it just doesn't do anything - it's as if it can't detect any changes to the DB schema. But I have created a new entity and a new mapping class, so I can't see why it's not picking them up.
var config = Fluently.Configure()
    .Database(() => {
        var dbConfig = MySQLConfiguration.Standard.ConnectionString(
            c => c.Server(configuration.Get<string>("server", ""))
                  .Database(configuration.Get<string>("database", ""))
                  .Password(configuration.Get<string>("password", ""))
                  .Username(configuration.Get<string>("user", ""))
        );
        return dbConfig;
    });

config.Mappings(
    m => m.FluentMappings
          .AddFromAssemblyOf<User>()
          .AddFromAssemblyOf<UserMap>()
          .Conventions.AddFromAssemblyOf<UserMap>()
          .Conventions.AddFromAssemblyOf<PrimaryKeyIdConvention>()
          // .PersistenceModel.Add(new CultureFilter())
);

var export = new SchemaUpdate(config);
export.Execute(false, true);
I don't think there's anything wrong with my config, because it works perfectly well with SchemaExport - it's just SchemaUpdate where I seem to have a problem.
Any ideas would be much appreciated!
Did you try wrapping the SchemaUpdate execution in a transaction? There are some databases which need to run this in a transaction, AFAIK.
using (var tx = session.BeginTransaction())
{
    var tempFileName = Path.GetTempFileName();
    try
    {
        using (var str = new StreamWriter(tempFileName))
        {
            new SchemaExport(configuration).Execute(showBuildScript, true, false, session.Connection, str);
        }
    }
    finally
    {
        if (File.Exists(tempFileName))
        {
            File.Delete(tempFileName);
        }
    }
    tx.Commit();
}
I figured it out:
The problem is that MySQL doesn't distinguish between databases and schemas: some parts of MySQL and/or NHibernate treat the schema as the database, and SchemaUpdate seems to be one of them. So when I have
Database=A
in my connection string, and
<class ... schema="B">
in the mapping, then SchemaUpdate seems to think that this class is "for a different database" and doesn't update it.
The only fix I can think of right now would be to do a SchemaUpdate for every single schema (calling USE schema; first). But AFAIK, NHibernate has no interface for getting a list of all schemas that are used in the mappings (correct me if I'm wrong), so I'm afraid I have to iterate through the XML files manually (I use XML-based mappings) and collect them...
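A rough sketch of that manual collection step, assuming standard hbm.xml mapping files (the mappings directory is a placeholder, and the per-schema USE / SchemaUpdate wiring is only hinted at in the comment):

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Collect the distinct schema="..." values from the <class> elements of all .hbm.xml files.
var hbm = XNamespace.Get("urn:nhibernate-mapping-2.2");
var schemas = Directory.GetFiles(@"path\to\mappings", "*.hbm.xml")
    .SelectMany(file => XDocument.Load(file)
        .Descendants(hbm + "class")
        .Select(c => (string)c.Attribute("schema")))
    .Where(s => !string.IsNullOrEmpty(s))
    .Distinct()
    .ToList();

// A SchemaUpdate run per schema would then go here, e.g. issuing "USE <schema>;"
// (or swapping the Database= value in the connection string) before each run.
foreach (var schema in schemas)
{
    Console.WriteLine("Found mapped schema: " + schema);
}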