Adding Data Import Handlers to ColdFusion 2016 Solr - MySQL

I am trying to index tables from our MySQL database on the Solr version that came with our ACF 2016 installation. Adobe's docs state that I need to use Solr's Data Import Handler to do this, which they say entails modifying solrconfig.xml and creating a data-config.xml file.
This does not work as is. Further reading leads me to believe that I need to:
Download a solr-dataimporthandler.jar
Copy ColdFusion's MySQL connector so that Solr can use it
Edit solrconfig.xml to account for these changes
Create a data-config.xml
Am I correct so far? Because I've been trying that, and when I try to reload my collection, the CF administrator gives me an error "Error handling 'reload' action." The Solr admin itself says:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'.
I don't know much about this, but it sounds to me like Solr cannot load one or both of the two jar files I added. If this is the case, then:
What solrconfig.xml should I be editing? So far I've been working on the one in the specific collection I set up to index our database.
Do I need both of those jars? Where should I put them? I have tried them in C:\ColdFusion2016\cfusion\jetty\lib and in a custom lib I set up at C:\ColdFusion2016\cfusion\jetty\multicore\lib.
Some sources (not Adobe) say I need to add lib directives to solrconfig.xml, while others say that any jars in a lib folder in Solr's "root" directory will automatically get added. I've tried both ways and get the errors described above.
Still other sources say I need to add them to my classpath. I am hesitant to do this on our server if we do not need to.
I know this question is all over the place, but I have gotten myself quite confused and I would really appreciate any help or pushes in the right direction. My hope is that I am just making some dumb mistakes somewhere, because I don't think it should be this complicated!
Note that Solr itself is running fine and some collections I have set up that index directories of PDFs are working, no troubles. None of the solrconfig.xml files in the other collections have any request handlers or libs referring to data import handlers.
Thanks in advance! I appreciate your reading all of this! :-)

Okay, so I finally got it working. As suspected, the root issue was Solr not locating the MySQL connector and data import handler jars. For the benefit of anyone else who stumbles across this, here is what I did, step by step. We are using Adobe ColdFusion 2016 with the Solr 5.2.1 that shipped with ACF.
1. You do need both the MySQL connector and data import handler jars. I used a version-matched data import handler jar, solr-dataimporthandler-5.2.1.jar, which I downloaded from here (make sure you select the jar file in the "files" section to start the download). For the MySQL connector, I just copied the one that ships with ColdFusion. Mine was called mysql-connector-java-5.1.38-bin.jar and, in my CF install, was located at C:\ColdFusion2016\cfusion\lib.
2. Solr automatically picks up jars placed in a lib folder under its home directory, with no lib directives or file edits needed. I created a folder called "lib" in my Solr home at C:\ColdFusion2016\cfusion\jetty\multicore and put both jars in there. So the full path to the new jars is C:\ColdFusion2016\cfusion\jetty\multicore\lib, and you do not need to edit any file to account for that.
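(If you prefer the explicit lib-directive approach that some sources describe, the equivalent would be to add something like the following near the top of each collection's solrconfig.xml. The dir value is resolved relative to the core's instance directory, so these paths are illustrative and would need adjusting for your layout; I did not need this, since the automatic lib folder worked:)
<lib dir="../../lib" regex="solr-dataimporthandler-.*\.jar" />
<lib dir="../../lib" regex="mysql-connector-java-.*\.jar" />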
3. You do need to edit solrconfig.xml to register the data import handler. For me, the only way I could get this to work was to edit solrconfig.xml for each collection; editing any of the other copies around the install did not work. So for my collection called "dmfile," which I had previously created in the CF Admin, the solrconfig.xml to edit was at C:\ColdFusion2016\cfusion\jetty\multicore\collections\vfs_dmfile\conf. I added the following in the section where the other request handlers live:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
That's all I needed to do for solrconfig.xml.
4. In that same directory, create a data-config.xml file. Here's mine:
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/myDatabase"
              user="myUsername"
              password="myPassword"/>
  <document>
    <entity name="dmfile" query="SELECT filename, ObjectID, status FROM dmfile WHERE status = 'approved'">
      <field column="filename" name="filename" />
      <field column="ObjectID" name="uid" />
      <field column="status" name="dmfile_status" />
    </entity>
  </document>
</dataConfig>
To get started, I tried to keep things simple. Note how the entity name matches the name of the collection. I mapped the unique ID from our database table (ObjectID) to Solr's standard unique ID field (uid). "column" is the column from our database and "name" is whatever name I want Solr to use. Your database will of course be different.
5. Next I edited schema.xml, also in the same directory, adding a field for each name used in data-config.xml:
<field name="filename" type="string" indexed="true" stored="true" required="false" />
<field name="dmfile_status" type="string" indexed="true" stored="true" required="false" />
The "name" attribute needs to match whatever you set in data-config.xml. Note that I did not add a field for uid -- it was already in schema.xml by default.
6. I am on a Windows server, so I went to services.msc and restarted the ColdFusion 2016 Add-On Services service (Solr runs under the add-on services, not under the main ColdFusion service). NOTE: restarting ColdFusion itself did not work for me; I needed to restart ColdFusion 2016 Add-On Services, and only that.
7. Finally I could reload my collection and, more importantly, browse the core in the Solr admin at http://localhost:8989/solr/#/. I selected my dmfile core in the "Core Selector" dropdown and was able to open the Dataimport section without getting an error.
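Once the handler loads, you can also kick off an import by calling the handler directly instead of using the Dataimport page. A sketch, assuming your core shows up as dmfile like mine did; command=full-import rebuilds the index, and you can add clean=false if you do not want existing documents deleted first:
http://localhost:8989/solr/dmfile/dataimport?command=full-import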
That is how I got it to work. I found that I needed to repeat steps 3-6 for every core that I wanted to connect to MySQL. Some documentation states that you can do at least step 3 at a global level, but that did not work for me at all.
Anyway, it took me quite a while to figure all of that out, so hopefully this will help other CFers out there who get stuck like I was.

Related

ASP.NET Ignore exception for potentially dangerous request

I've got an ASP.NET webpage that allows the user to change their password. Currently, when a user creates a new password in the form field that ends with a "#" sign after an ampersand, the server throws an error stating that a potentially dangerous Request.Form value was detected from the client. I did some research, and the validator appears to flag the "&#" sequence because it looks like the beginning of an HTML character entity. For example, a user can enter a password such as ABC#& but not ABC&#.
Is there anything I can put in my code on my ASP.NET page, such as in the header line, to prevent the server from treating this as an injection attempt? All SQL in my code-behind uses parameters, so I am not worried about SQL injection at the moment. I just want every user to be able to have any combination of characters in their password. Thanks for any insight!
Check out this article from Microsoft if this is something you really want to achieve: https://msdn.microsoft.com/en-us/library/hh882339(v=vs.110).aspx
The article mentions you may have to modify your web.config to include the following settings:
<system.web>
  <httpRuntime requestValidationMode="2.0" />
</system.web>
and
<configuration>
  <system.web>
    <pages validateRequest="false" />
  </system.web>
</configuration>
Or for an individual page:
<%@ Page validateRequest="false" %>
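If the site targets .NET 4.5 and you keep requestValidationMode at its 4.5 default, the same article also describes a finer-grained option: disabling validation for a single control rather than the whole page. A sketch, with a hypothetical password box:
<asp:TextBox ID="NewPassword" runat="server" TextMode="Password"
    ValidateRequestMode="Disabled" />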
Note that these configurations may vary depending on the type of .NET project you are working on. The article mentioned above has more information; everything here comes straight from that source.

Creating a new cube with Saiku plugin on Pentaho

I have installed Pentaho Community Edition 5.0.1 and successfully created a new data source from a MySQL database (both hosted on my PC). I now want to perform OLAP analysis on this data, but am not entirely sure how to proceed (so please provide additional information if it is clear from my questions that I am heading down the wrong path; I am new to OLAP).
The Saiku plugin is installed and I can view the two premade cubes (SampleData and SteelWheels). I now want to create one or more cubes referencing the data in the data source I previously created, which is where I am stuck. As I understand it, I need to create a Mondrian schema which defines the cube; going by this, I created the schema for the data source, as defined in the same tutorial, as:
<Schema name="testdb">
<Cube name="Test Cube">
<Table name="testtable">
</Table>
<Dimension name="Date">
<Hierarchy hasAll="true">
<Level name="Date" column="date" type="Integer"/>
</Hierarchy>
</Dimension>
<Dimension name="Key 1">
<Hierarchy hasAll="true">
<Level name="Key 1" column="key1" type="String"/>
</Hierarchy>
</Dimension>
<Dimension name="Key 2">
<Hierarchy hasAll="true">
<Level name="Key 2" column="key2" type="String"/>
</Hierarchy>
</Dimension>
<Measure name="Value" column="value" aggregator="sum"/>
The tutorial states that this file can be placed anywhere (I'm assuming in the biserver-ce folder or subfolders; any best practices on location?). When refreshing the cubes in Saiku (by pressing the green-arrows icon), big surprise: the new cube is not listed; only SampleData and SteelWheels are options in the dropdown (Pentaho had been restarted etc., no effect). When inspecting the created schema, there is no reference to the created data source, so I have no idea how it can be linked to or used by Saiku/Pentaho. This is where I think the problem lies: I need to register this file somehow. I have seen references to a data source definition file (like here), which seems to be what I need. I can't, however, find where this file should be placed, what it should be named, or any tutorial incorporating such a step. I also find it strange that one has to break out of the usage flow of the Pentaho application to create external files which are needed for the following steps; that hints that I am doing things wrong.
In summary: how do I create an OLAP cube using Pentaho CE and Saiku from a working data source?
It has been surprisingly difficult to find well-documented help on the usage of Pentaho CE (with Saiku), or warnings about the numerous setup issues, so I think verbose answers to this question will be of help to the community.
To get the cube into the BI Server, create the cube in Pentaho Schema Workbench and save the schema file to whatever location you want (or create a folder for it and save it there).
After that, publish the cube from Schema Workbench.
If you have added the Saiku plugin to the BI Server, the newly created cube will then be listed alongside SteelWheels.
That is the whole procedure; compare what you did against these steps.
Sometimes restarting the BI Server will not solve the problem on its own; just refresh the list of cubes (the dropdown where SampleData and SteelWheels are displayed) and the newly created cube should appear in that list.
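If you would rather register the schema by hand (the data source definition file mentioned in the question), older Mondrian/Pentaho setups read catalog entries from pentaho-solutions/system/olap/datasources.xml. I have not verified this against 5.0.1, and the JNDI data source name and schema path below are illustrative, not taken from a working install:
<Catalogs>
  <Catalog name="testdb">
    <DataSourceInfo>Provider=mondrian;DataSource=MyMySQLDataSource</DataSourceInfo>
    <Definition>solution:admin/schemas/testdb.mondrian.xml</Definition>
  </Catalog>
</Catalogs>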

Azure Worker Role configuration issue while using SlowCheetah with custom config

We are using NLog as the logging tool in the Worker Role of our Azure app.
It requires an NLog.config file. We installed "SlowCheetah - XML Transforms" and have two transforms, Debug and Release.
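For reference, the transforms are plain XDT files; a simplified sketch of what an NLog.Release.config transform might look like (the rule and target names here are illustrative, not our real config):
<?xml version="1.0" encoding="utf-8"?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <rules>
    <!-- In Release, raise the minimum level for every logger -->
    <logger name="*" minlevel="Warn" writeTo="file"
            xdt:Transform="Replace" xdt:Locator="Match(name)" />
  </rules>
</nlog>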
The solution rebuilds successfully.
But when I try to run, I get the following error. (I used the exact same transformation for NLog.config in one of my Windows service apps, and it works fine there.)
Error 163: The item "bin\Debug\NLog.config" in item list "OutputGroups" does not define a value for metadata "TargetPath". In order to use this metadata, either qualify it by specifying %(OutputGroups.TargetPath), or ensure that all items in this list define a value for this metadata.
C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v10.0\Windows Azure Tools\1.6\Microsoft.WindowsAzure.targets, line 2299, column 5 (Insight.CloudWeb)
I don't know if this is done by the SlowCheetah extension, but could you verify whether your *.csproj file contains an AfterCompile target similar to this?
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
<UsingTask TaskName="TransformXml"
           AssemblyFile="$(MSBuildExtensionsPath32)\Microsoft\VisualStudio\v10.0\Web\Microsoft.Web.Publishing.Tasks.dll" />
<Target Name="AfterCompile" Condition="exists('NLog.$(Configuration).config')">
  <TransformXml Source="NLog.config"
                Destination="$(IntermediateOutputPath)$(TargetFileName).config"
                Transform="NLog.$(Configuration).config" />
  <ItemGroup>
    <AppConfigWithTargetPath Remove="NLog.config" />
    <AppConfigWithTargetPath Include="$(IntermediateOutputPath)$(TargetFileName).config">
      <TargetPath>$(TargetFileName).config</TargetPath>
    </AppConfigWithTargetPath>
  </ItemGroup>
</Target>
Take a look at Oleg's blog post ".Config File Transformation", under the section "App.config File Transformation", for more information.
I have a fix for this: you should now be able to transform app.config as well as other XML files for Azure Worker Roles using SlowCheetah. Once I get the fix verified, I will release the update to the VS gallery.
If you would like to try the fix, you can download the updated VSIX at https://dl.dropbox.com/u/40134810/SlowCheetah/issue-44/SlowCheetah-issue-44.zip. If you are interested in following up on this, please use issue #44.

Using a shared data source for dynamically generated and deployed reports

I'm dynamically generating RDL files for SSRS 2008, assembling my reports from "building blocks" which I defined as reports on Report Server, and which I use as subreports on my generated report.
On my Report Server, I have a single, shared data source which does work as long as I run stuff directly on the report server.
What I'm trying to accomplish is this:
my generated main report should reference that shared data source
my subreports contained on the generated main report should also use the same data source
after I deploy the report to report server using the webservice interface, I'd like to be able to actually see the report right away
For now, I can generate and validate my RDL just fine, I can deploy it to the report server just fine, too - it shows up and all, great.
But when I try to view the report, I get an error that my data source is invalid or has been removed.
What am I missing? I am pretty sure I have the right data source (GUID and all), and the names do match. How do I tell a generated RDL to use a shared data source already present on the server?
Answering my own question here, hoping someone else might find this useful:
I was under the (false) impression that the unique "DataSourceID" given to a data source on the server would be sufficient to identify it uniquely.
So in my generated RDL, I had something like:
<DataSources>
  <DataSource Name="MyDataSource">
    <Transaction>true</Transaction>
    <DataSourceReference>MyDataSource</DataSourceReference>
    <rd:DataSourceID>6ba7c588-e270-4de9-988c-d2af024f10e1</rd:DataSourceID>
    <rd:SecurityType>None</rd:SecurityType>
  </DataSource>
</DataSources>
Now this worked once, when my data source was indeed called "MyDataSource" and located in the same directory as my report which I published through the RS WebService API.
As soon as I moved the data source elsewhere, it stopped working.
THE SOLUTION:
This may sound silly, but I really didn't "get it" at first: the DataSourceReference needs to contain the full path on the Report Server to the data source I want to reference. Just specifying the unique ID won't do.
So once I changed my RDL to:
<DataSources>
  <DataSource Name="MyDataSource">
    <Transaction>true</Transaction>
    <DataSourceReference>/MyProject/DataSources/MyDataSource</DataSourceReference>
    <rd:DataSourceID>6ba7c588-e270-4de9-988c-d2af024f10e1</rd:DataSourceID>
    <rd:SecurityType>None</rd:SecurityType>
  </DataSource>
</DataSources>
(notice the <DataSourceReference>/MyProject/DataSources/MyDataSource</DataSourceReference>)
it has worked like a charm ever since.
Hope someone might find this useful some day!

Web.Config file and Linq to Sql changing its place

I'm having a strange issue with my project. It was a Web Site that has been converted to a Web Application inside a solution. Initially, classes were set up using a Linq to Sql .dbml file, which stored its connection string in /MyProject/web.config. Now that the project (the Web Application) is in a solution, whenever I modify the Linq to Sql .dbml file, it creates a web.config containing only its connection string one level up, at /MySolution/web.config, while I still have /MySolution/MyProject/web.config. That causes errors about duplicate connection string names. So, how can I have Linq to Sql just use the web.config file in /MySolution/MyProject/web.config? Or is my entire web.config file supposed to be in /MySolution/web.config? (I would prefer to keep it where it is.) Thanks!
PS: the datacontext is in /MySolution/MyProject/MyCode/Models/MyDataContext.dbml
It appears that updating the .dbml will always force only the root web.config to be updated. It will likely be easiest to maintain your project if you use only that root web.config, but you do have another option.
Each folder can have its own config, which is why you're getting the duplicate-name exception. If you want to get around this, you can first remove and then re-add a connection string with the same name. If you do this, the connectionStrings block (within /MySolution/MyProject/web.config) will look similar to the following:
<connectionStrings>
  <remove name="MyConnectionString" />
  <add name="MyConnectionString" connectionString="XXXXXXXXXX"
       providerName="System.Data.SqlClient" />
</connectionStrings>
As I said, I can't really recommend that you do this, because the .dbml will still save to the root web.config, so it might not be easy for other developers to realize what is going on.
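A third option, if the goal is just to keep one authoritative copy of the connection string, is the configSource attribute, which pulls a whole config section from a separate file (the file name below is illustrative). It does not stop the designer writing to the root web.config, but it keeps the real value in one place:
<connectionStrings configSource="ConnectionStrings.config" />
where ConnectionStrings.config contains the full section:
<connectionStrings>
  <add name="MyConnectionString" connectionString="XXXXXXXXXX"
       providerName="System.Data.SqlClient" />
</connectionStrings>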