What is the best way to read CSV data?

I am using C# and trying to read a CSV file using this connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Documents and Settings\rajesh.yadava\Desktop\orcad;Extended Properties="Text;HDR=YES;IMEX=1;FMT=Delimited"
This works for tab-delimited data.
I want a connection string that works for tab-delimited data as well as comma (,) and pipe (|) delimited data.
How can I make a generic connection string for CSV?
Thanks
Rajesh

Is the FileHelpers library an option?

I know this doesn't answer your questions, but here's a word of warning.
I've had to create my own reader because the correct drivers are not available when you run on a 64-bit system.
If your software will ever run on a 64-bit system, test it there first and make sure that the OLE DB or ODBC drivers will be present.

If you need fast sequential access to the CSV file, the Fast CSV Reader could be an option. I used it on a project some time ago with great success. It is supposed to be optimized quite well and also provides a cached version, if you need it. Additionally, it has been updated several times since it was first released back in 2005 (last updated on 2008-10-09), and it supports basic data binding by implementing System.Data.IDataReader.

Here are a few links from the net discussing this issue:
Manipulating CSV Files
ConnectionStrings.com
How To Open Delimited Text Files Using the Jet Provider's Text IIsam

Is using the TextFieldParser class an option?
http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx

Without rolling a custom solution, I'm not sure there's a straightforward way to support more than one delimiter. This page suggests that through schema.ini you can choose between:
TabDelimited
CSVDelimited
one specific character (except double quote)
fixed width
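If you do end up rolling a custom solution, one simple approach is to guess the delimiter from the header line and then parse with it. Below is a minimal sketch of that idea in Python (used here only to illustrate the logic, not the Jet provider). The names guess_delimiter and read_delimited are made up for this example, and it assumes the header row contains the delimiter more often than any other candidate character.

```python
import csv
import io

# Candidate delimiters from the question: tab, comma and pipe.
CANDIDATE_DELIMITERS = ("\t", ",", "|")


def guess_delimiter(header_line, candidates=CANDIDATE_DELIMITERS):
    """Pick the candidate that occurs most often in the header line.

    Ties (including a line with no candidate at all) fall back to the
    first candidate, which is the tab character.
    """
    return max(candidates, key=header_line.count)


def read_delimited(text):
    """Parse delimited text into a list of dicts keyed by the header row."""
    first_line = text.splitlines()[0] if text else ""
    delimiter = guess_delimiter(first_line)
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    print(read_delimited("id|name\n12345|Dell\n"))
```

The heuristic is deliberately naive: a quoted header field that contains one of the other candidate characters could fool it, so treat it as a starting point only.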

class CSVFile extends SplFileObject
{
    // Column names taken from the first row of the file
    private $keys;
    public function __construct($file)
    {
        parent::__construct($file);
        $this->setFlags(SplFileObject::READ_CSV);
    }
    public function rewind()
    {
        parent::rewind();
        // Treat the first row as the header, then move on to the first data row
        $this->keys = parent::current();
        parent::next();
    }
    public function current()
    {
        // Return each row as an associative array keyed by the header
        return array_combine($this->keys, parent::current());
    }
    public function getKeys()
    {
        return $this->keys;
    }
}
then use with:
$csv = new CSVFile('example.csv');
and you can iterate through lines using:
foreach ($csv as $line)
{
    // $line is an associative array keyed by the header row
}

Splitting file name in SSIS

I have files in one folder with following naming convention
ClientID_ClientName_Date_Fileextension
12345_Dell_20110103.CSV
I want to extract the ClientID from the filename and store it in a variable. I am unsure how I would do this. It seems that a Script Task would suffice, but I do not know how to proceed.
Your options are using Expressions on SSIS Variables or using a Script Task. As a general rule, I prefer Expressions, but I can tell that this would take a lot of code or a lot of intertwined variables.
Instead, I'd use the String.Split method in .NET. If you called the Split method for your sample data and provided a delimiter of the underscore _ then you'd receive a 3 element array
12345
Dell
20110103.CSV
Wrap that in a try/catch block and grab the first element, which is the ClientID. This is quick and dirty: a client name that itself contains an underscore, as in 12345_Dell_Quest_20110103.CSV, shifts the later elements, but the ClientID at index 0 is unaffected.
Approximate code:
string phrase = Dts.Variables["User::CurrentFile"].Value.ToString();
string[] stringSeparators = new string[] { "_" };
string[] words;
try
{
    // Split "12345_Dell_20110103.CSV" into { "12345", "Dell", "20110103.CSV" }
    words = phrase.Split(stringSeparators, StringSplitOptions.None);
    // Index 0 holds the ClientID
    Dts.Variables["User::ClientID"].Value = words[0];
}
catch
{
    ; // Do something with this error
}
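The question asks for the ClientID, which is the first underscore-separated field of the file name. Here is a small sketch of that split in Python, just to show the logic outside of SSIS. The function name client_id_from_filename is made up for this example, and it assumes the variable may hold either a bare file name or a full Windows path.

```python
from pathlib import PureWindowsPath


def client_id_from_filename(path):
    """Return the text before the first underscore of the file name."""
    # Drop any folder part so that only the file name itself is split.
    file_name = PureWindowsPath(path).name
    client_id, separator, _rest = file_name.partition("_")
    if not separator:
        # No underscore at all: the naming convention was not followed.
        raise ValueError("no underscore in file name: " + file_name)
    return client_id


if __name__ == "__main__":
    print(client_id_from_filename("12345_Dell_20110103.CSV"))
```

Splitting only on the first underscore means a client name that itself contains underscores does not affect the result.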

Using RIO and Sqlite-net in MvvmCross

In the excellent mvvmcross-library I can use RIO binding to prevent unreadable code:
public INC<String> Title = new NC<String>();
Then I can read and write values using Title.Value. Makes the models much more readable.
Normally, this property would be written as:
private string _title;
public string Title
{
    get { return _title; }
    set
    {
        _title = value;
        RaisePropertyChanged("Title");
    }
}
But when I want to use sqlite-net, these fields cannot be streamed to the database because they are not basic types with a getter and setter.
I can think of a few ways to get around that:
- Make a new simple object that is similar to the model, but only with the direct db-fields, and create a simple import-export static method on the model. This could also prevent struggling with complex model code that never needs to relate to the actual database.
- Make sqlite-net understand reading NC-fields. I read into the code of the mapper, but it looks like this is going to be a lot of work because it relies on the getter-setter. I did not find a way to insert a custom mapping for a type that could be generic.
- Remove RIO and just put in all the code myself instead of relying on RIO.
Maybe someone has some advice?
Thanks Stuart. It was exactly my thought, so I did implement it that way: my (DB) Models do not contain RIO. Only my viewmodels do, and they reference a Model that is DB-compatible.
So, for posterity the following tips:
- Do not use RIO in your models that need to be database-backed.
- Reference models in your viewmodels. In the binding you can use the . (dot) to reference this model.
This keeps them nicely separated. It also gives you another advantage: if you need to reuse a model under different circumstances (because the same object might be displayed twice on the screen), it is much easier to handle these situations by finding the already instantiated model.

Extending CodeIgniter Security.php to enable logging

TLDR; I want to enable database-logging of xss_clean() when replacing evil data.
I want to enable database logging of the xss_clean() function in Security.php. Basically, I want to know whether the input I feed to xss_clean() was identified as containing malicious data that was then filtered out.
So basically:
$str = '<script>alert();</script>';
$str = xss_clean($str);
What would happen ideally for me is:
Clean the string from XSS
Return the clean $str
Input information about the evil data (and eventually the logged in user) to the database
As far as I can see in the Security.php-file there is nothing that takes care of this for me, or something that COULD do so by hooks etc. I might be mistaken of course.
Since Security.php does not log how many replacements were made, am I forced to extend Security.php, copy and paste the current code of the original function, and alter it to support this? Or is there a solution that is cleaner and safer for future updates of CodeIgniter (and especially of the files being tampered with or extended)?
You would need to extend the Security class, but there is absolutely no need to copy and paste any code if all you need is a log of the input/output. Something along the lines of the following would allow you to do so:
class MY_Security extends CI_Security {
    public function xss_clean($str, $is_image = FALSE) {
        // Do whatever you need here with the input ... ($str, $is_image)
        $str = parent::xss_clean($str, $is_image);
        // Do whatever you need here with the output ... ($str)
        return $str;
    }
}
That way, you are just wrapping the existing function and working with its input and output. If you are concerned about changes to the underlying method's signature, you could be more forward compatible by using the PHP function func_get_args() to pass the arguments through transparently.
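To know whether anything was actually filtered, the wrapper can compare the string before and after the parent call and log only when they differ. Here is a rough sketch of that pattern in Python, purely to illustrate the idea. The Security class below is a toy stand-in, not CodeIgniter's code and not a real XSS filter, and the list used as a log is a placeholder for a database insert.

```python
import re


class Security:
    """Toy stand-in for a framework sanitizer (hypothetical)."""

    def xss_clean(self, text):
        # Strips <script>...</script> blocks only; NOT a real XSS filter.
        return re.sub(r"(?is)<script.*?>.*?</script>", "", text)


class LoggingSecurity(Security):
    """Wraps the parent method and records inputs that were altered."""

    def __init__(self, log):
        # Any list-like sink; in real use this would write to the database.
        self.log = log

    def xss_clean(self, text):
        cleaned = super().xss_clean(text)
        if cleaned != text:
            # The parent changed something, so the input contained
            # data that the filter considered unsafe.
            self.log.append({"original": text, "cleaned": cleaned})
        return cleaned


if __name__ == "__main__":
    entries = []
    security = LoggingSecurity(entries)
    print(security.xss_clean("<script>alert();</script>hello"))
    print(entries)
```

Comparing input and output does not tell you how many replacements were made, only that at least one happened, which may be enough for an audit log.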

Deserialization problem: Error when deserializing from a different program version

I finally decided to post my problem, after a couple of hours spent searching the Internet for solutions and trying some.
[Problem Context]
I am developing an application which will be deployed in two parts:
an XML Importer tool: its role is to Load/Read an xml file in order to fill some datastructures, which are afterwards serialized into a binary file.
the end user application: it will Load the binary file generated by the XML Importer and do some stuff with the recovered data structures.
For now, I only use the XML Importer for both purposes (meaning I first load the xml and save it to a binary file, then I reopen the XML Importer and load my binary file).
[Actual Problem]
This works just fine and I am able to recover all the data I had after XML loading, as long as I do that with the same build of my XML Importer. This is not viable, as I will need at the very least two different builds, one for the XML Importer and one for the end user application. Please note that the two versions of the XML Importer I use for my testing are exactly the same concerning the source code and thus the datastructures, the only difference lies in the build number (to force a different build I just add a space somewhere and build again).
So what I'm trying to do is:
Build a version of my XML Importer
Open the XML Importer, load an XML file and save the resulting datastructures to a binary file
Rebuild the XML Importer
Open the XML Importer newly built, load the previously created binary file and recover my datastructures.
At this time, I get an Exception:
SerializationException: Could not find type 'System.Collections.Generic.List`1[[Grid, 74b7fa2fcc11e47f8bc966e9110610a6, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null]]'.
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadType (System.IO.BinaryReader reader, TypeTag code)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadTypeMetadata (System.IO.BinaryReader reader, Boolean isRuntimeObject, Boolean hasTypeInfo)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadObjectInstance (System.IO.BinaryReader reader, Boolean isRuntimeObject, Boolean hasTypeInfo, System.Int64& objectId, System.Object& value, System.Runtime.Serialization.SerializationInfo& info)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadObject (BinaryElement element, System.IO.BinaryReader reader, System.Int64& objectId, System.Object& value, System.Runtime.Serialization.SerializationInfo& info)
For your information (I don't know if this is useful or not), the actual type it is struggling to deserialize is a List<Grid>, Grid being a custom class (which is correctly serializable, as I am able to do it when using the same version of the XML Importer).
[Potential Solution]
I do believe it comes from somewhere around the Assembly, as I read many posts and articles about this. However, I already have a custom Binder taking care of the differences of Assembly names, looking like this:
public sealed class VersionDeserializationBinder : SerializationBinder
{
    public override Type BindToType( string assemblyName, string typeName )
    {
        if ( !string.IsNullOrEmpty( assemblyName ) && !string.IsNullOrEmpty( typeName ) )
        {
            Type typeToDeserialize = null;
            assemblyName = Assembly.GetExecutingAssembly().FullName;
            // The following line of code returns the type.
            typeToDeserialize = Type.GetType( String.Format( "{0}, {1}", typeName, assemblyName ) );
            return typeToDeserialize;
        }
        return null;
    }
}
which I assign to the BinaryFormatter before deserializing here:
public static SaveData Load (string filePath)
{
    SaveData data = null;//new SaveData ();
    Stream stream;
    stream = File.Open(filePath, FileMode.Open);
    BinaryFormatter bformatter = new BinaryFormatter();
    bformatter.Binder = new VersionDeserializationBinder();
    data = (SaveData)bformatter.Deserialize(stream);
    stream.Close();
    Debug.Log("Binary version loaded from " + filePath);
    return data;
}
Do any of you guys have an idea on how I could fix it? Would be awesome, pretty please :)
Move the working bits to a separate assembly and use the assembly in both "server" and "client". Based on your explanation of the problem, this should get around the "wrong version" problem, if that is the core issue. I would also take any "models" (i.e. bits of state like Grid) to a domain model project, and use that in both places.
I just bumped into your thread while I had the same problem. Especially your code sample with the SerializationBinder helped me a lot. I just had to modify it slightly to tell a difference between my own assemblies and those of Microsoft. Hopefully it still helps you, too:
sealed class VersionDeserializationBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        Type typeToDeserialize = null;
        string currentAssemblyInfo = Assembly.GetExecutingAssembly().FullName;
        //my modification
        string currentAssemblyName = currentAssemblyInfo.Split(',')[0];
        if (assemblyName.StartsWith(currentAssemblyName))
            assemblyName = currentAssemblyInfo;
        typeToDeserialize = Type.GetType(string.Format("{0}, {1}", typeName, assemblyName));
        return typeToDeserialize;
    }
}
I believe the problem is that you are telling it to look for List<> in the executing assembly, whereas in fact it lives in a framework assembly (mscorlib). You should only re-assign the assembly name in your binder if the original assembly is one of yours.
Also, you might have to handle the parameter types for generics specifically in the binder, by parsing out the type name and making sure the parameter types are not specific to the foreign assembly when you return the parameterized generic type.
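As a rough sketch of the "parse out the type name" idea, here is the string handling on its own, written in Python only to show the logic (a C# binder could do the same with a regular expression or manual parsing). The function name retarget_generic_args and the short list of framework assembly names are assumptions made for this example, and nested generics with several levels are not fully covered.

```python
import re

# One generic argument of the form "[TypeName, AssemblyName, Version=..., ...]".
_GENERIC_ARG = re.compile(r"\[([^\[\],]+), ([^\[\],]+)((?:, [^\[\],]+)*)\]")


def retarget_generic_args(type_name, current_assembly,
                          framework_assemblies=("mscorlib", "System")):
    """Point generic arguments from foreign builds at the current assembly.

    Arguments that live in a framework assembly are left untouched.
    """
    def _swap(match):
        arg_type, assembly = match.group(1), match.group(2)
        if assembly in framework_assemblies or assembly.startswith("System."):
            return match.group(0)
        return "[{}, {}]".format(arg_type, current_assembly)

    return _GENERIC_ARG.sub(_swap, type_name)


if __name__ == "__main__":
    name = ("System.Collections.Generic.List`1[[Grid, "
            "74b7fa2fcc11e47f8bc966e9110610a6, Version=0.0.0.0, "
            "Culture=neutral, PublicKeyToken=null]]")
    print(retarget_generic_args(name, "Importer"))
```

In the binder, the rewritten type name would be combined with the original, unchanged assembly name of the outer type (the one that holds List) before the type lookup, so that only the Grid reference is redirected.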

How to Replace content inside custom tag in MediaWiki before saving to database?

First of all, I have both MW 1.16 and 1.17 set up with PHP 5.3.5, MySQL 5.5.8 and Apache 2.2.17.
I've written a simple $wgExtensionFunction which right now does nothing.
$wgExtensionFunctions[] = "wfTestExtension";
function wfTestExtension() {
    global $wgParser;
    $wgParser->setHook("myTag", "renderTest");
}
function renderTest($input) {
    return $input;
}
What I want to do is this: if I type <myTag>Blah blah blah</myTag> in the add or edit form, I want to be able to change the contents inside myTag BEFORE saving it to the database. What mechanism should I use for this? I'm assuming hooks? For example, with the ArticleSave hook, the $text var already has the <myTag> stripped out, so there's no way of parsing the string and figuring out what was originally inside the <myTag>.
I've spent hours trying to find something on Google, but I've almost given up. Any advice at all would be highly appreciated.
Cheers.
Maybe http://www.mediawiki.org/wiki/Manual:Hooks/ParserBeforeStrip would work:
"Used to process the raw wiki code before any internal processing is applied"
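Once a hook hands you the raw wiki text, the rewrite itself is plain string processing. Here is a minimal sketch of that step in Python, shown only to illustrate the idea and not MediaWiki code. The function name rewrite_tag_content is made up for this example, and it assumes the tag has no attributes and is not nested.

```python
import re


def rewrite_tag_content(raw_text, tag, transform):
    """Apply transform to the text between <tag> and </tag>."""
    pattern = re.compile(
        r"(<{0}>)(.*?)(</{0}>)".format(re.escape(tag)),
        re.DOTALL,  # let the content span several lines
    )
    # Keep the opening and closing tags, replace only what is between them.
    return pattern.sub(
        lambda m: m.group(1) + transform(m.group(2)) + m.group(3),
        raw_text,
    )


if __name__ == "__main__":
    text = "Intro <myTag>Blah blah blah</myTag> end"
    print(rewrite_tag_content(text, "myTag", str.upper))
```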
This is another "maybe" but you could try using a combination of a template and the {{subst:}} command (see transclusion).
(See also Wikis and Wikipedia)