SQLite3: Failing to import CSV on Mac

I'm a newbie to sqlite3 and I'm trying to play around with some data that I got my hands on.
I'm currently running into the problem in which I cannot even create my table in my shell. I'm trying to run
.mode csv
.import filename tab1
But I'm getting this error
CREATE TABLE tab1(...) failed: duplicate column name: B-
Since I'm importing a .csv file, I thought that perhaps I do have a duplicate in my column names but they all seem to be distinct to me. Can someone point me in the right direction? I would really appreciate it :-)
Here is a short excerpt of my .csv file:
SID,Name,Cohort,Email,Gender,Ethnicity,Major,Grades,More grades,Last grades,Not sure,,
24239361,name1,Cohort 1-Fall 2013,name1@email.com,Female,Chinese,CS,B,C+,B-,B-,,
24474707,name2,Cohort 1-Fall 2013,name2@email.com,Male,Chinese,CS,B,B+,B-,B-,,
24266062,name3,Cohort 1-Fall 2013,name3@email.com,Male,White,CS,B-,B-,C,B ,,
UPDATE:
Edited my csv file to now only be:
SID,Name,Cohort
24239361,Name1,Cohort 1-Fall 2013
24474707,Name2,Cohort 1-Fall 2013
24266062,Name3,Cohort 1-Fall 2013
22181134,Name4,Cohort 1-Fall 2013
And when I import it and do .schema, I get this:
CREATE TABLE foo(
"SID" TEXT,
"Name" TEXT,
24239361" TEXT,
"Name1" TEXT,
24474707" TEXT,l 2013
"Name2" TEXT,
24266062" TEXT,l 2013
"Name3" TEXT,
22181134" TEXT,l 2013
"Name4" TEXT,
24527147" TEXT,l 2013
This is really strange, because the import is skipping over the header field "Cohort" and is instead reading all of the following lines as column definitions.

I can't reproduce your issue exactly, but when I try with your sample data I get the following not-quite-identical error:
CREATE TABLE tab1(...) failed: duplicate column name:
And the reason for that is those two commas at the end of your first line (representing the columns of the new table). SQLite tries to make two columns with blank names and fails on the second. The solution is to either remove those commas from every line of the file, or to give those fields valid column names.
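As a quick sketch of that cleanup step (plain Python; the file handling is left to you), you can strip the empty trailing fields from each parsed row before importing:

```python
def drop_trailing_empty(row):
    """Remove empty fields from the end of a parsed CSV row."""
    while row and row[-1] == "":
        row.pop()
    return row

print(drop_trailing_empty(["SID", "Name", "Cohort", "", ""]))  # ['SID', 'Name', 'Cohort']
```

Note that interior empty fields are kept; only the trailing blanks that would become unnamed columns are removed.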
I wasn't able to specifically reproduce your issue, but it looks like SQLite is not seeing that first line for some reason or another and is trying to set the columns up based on one of the next lines (which have multiple fields with value B-, causing the same issue as above). You'll need to track down why that's happening. Alternatively, you can create the table first, and remove the column heading line from the file before you import it:
CREATE TABLE tab1 (SID INTEGER, Name TEXT, ...);
.mode csv
.import filename tab1

The sqlite3 tool expects the new-line characters in CSV files to be CR+LF (as specified in RFC 4180).
Your file looks like this:
00000000: 53 49 44 2c 4e 61 6d 65 2c 43 6f 68 6f 72 74 0d SID,Name,Cohort.
00000010: 32 34 32 33 39 33 36 31 2c 41 6c 6c 69 73 6f 6e 24239361,Xxxxxxx
00000020: 20 43 6f 72 69 6e 6e 65 20 59 65 65 2c 43 6f 68 Xxxxxxx Xxx,Coh
00000030: 6f 72 74 20 31 2d 46 61 6c 6c 20 32 30 31 33 0d ort 1-Fall 2013.
00000040: 32 34 34 37 34 37 30 37 2c 41 6e 74 68 6f 6e 79 24474707,Xxxxxxx
...
This file has Mac line endings (only CR), which would be valid for a normal text file.
You can manually change the row separator after setting CSV mode:
.mode csv
.separator , \r
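Alternatively, you can normalize the line endings to LF before importing. A sketch in Python (the filenames and the idea of rewriting the file are my own, not part of the original answer):

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Normalize CRLF and bare CR (classic Mac) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

# Hypothetical usage: rewrite the file, then .mode csv / .import as before.
# open("tab1-lf.csv", "wb").write(to_unix_newlines(open("tab1.csv", "rb").read()))
print(to_unix_newlines(b"SID,Name,Cohort\r24239361,Name1,Cohort 1-Fall 2013\r"))
```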


What is the excess-65 exponent format?

According to the IBM Informix docs:
DECIMAL(p, s) values are stored internally with the first byte representing a sign bit and a 7-bit exponent in excess-65 format.
How does the "excess-65" format work?
References
DECIMAL(p,s) Data Types
DECIMAL Storage
The notation is specific to Informix and its DECIMAL and MONEY types — AFAIK, no other product uses it. Informix also uses it within its DATETIME and INTERVAL types, but that's an implementation detail for the most part.
I've always known the on-disk form as 'excess-64' rather than 'excess-65'; I'm not sure which is correct, but I think 64 has a solid basis.
The 'excess-6n' form is used for disk storage. It has the benefit that two decimal values in the disk format can be compared using memcmp() to get a correct comparison (though NULL values have to be handled separately — NULL values always cause pain and grief).
The decimal.h header from ESQL/C (and C-ISAM) contains the information:
/*
 * Packed Format (format in records in files)
 *
 * First byte =
 *     top 1 bit  = sign 0=neg, 1=pos
 *     low 7 bits = Exponent in excess 64 format
 * Rest of bytes = base 100 digits in 100 complement format
 *
 * Notes -- This format sorts numerically with just a
 *     simple byte by byte unsigned comparison.
 *     Zero is represented as 80,00,00,... (hex).
 *     Negative numbers have the exponent complemented
 *     and the base 100 digits in 100's complement
 */
Note the mention of 64 rather than 65. Also note that 'decimal' is in some respects a misnomer; the data is represented using a 'centesimal' (base-100) notation.
Here are some sample values: the decimal representation and then the bytes of the on-disk format. Note that to some extent the number of bytes is arbitrary. If you use something like DECIMAL(16,4), there will be one sign-and-exponent byte and 8 bytes of data (and the range of exponents will be limited). If you use DECIMAL(16), i.e. floating point, then the range of exponents is much less limited.
Decimal value Byte representation (hex)
0 80 00 00 00 00
1 C1 01
-1 3E 63
9.9 C1 09 5A 00
-9.9 3E 5A 0A 00
99.99 C1 63 63 00 00 00
-99.99 3E 00 01 00 00 00
999.999 C2 09 63 63 5A
-999.999 3D 5A 00 00 0A
0.1 C0 0A 00 00
-0.1 3F 5A 00 00
0.00012345 BF 01 17 2D 00
-0.00012345 40 62 4C 37 00
1.2345678901234e-09 BC 0C 22 38 4E 5A 0C 22
-1.2345678901234e-09 43 57 41 2B 15 09 57 42
1.2345678901234e+09 C5 0C 22 38 4E 5A 0C 22
-1.2345678901234e+09 3A 57 41 2B 15 09 57 42
And so on.
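Based on the decimal.h comment above, a decoder for this format can be sketched in Python. This is my reading of the layout, not an official Informix implementation, but it reproduces the sample values listed:

```python
def decode_informix_decimal(data: bytes) -> float:
    """Decode the packed on-disk DECIMAL format described above."""
    first, digits = data[0], list(data[1:])
    if first == 0x80 and not any(digits):      # zero is 80,00,00,... (hex)
        return 0.0
    if first & 0x80:                           # sign bit set: positive
        sign, exp = 1, (first & 0x7F) - 64     # exponent in excess-64
    else:                                      # negative: first byte complemented
        sign, exp = -1, ((first ^ 0xFF) & 0x7F) - 64
        value = 0                              # undo the 100's complement
        for d in digits:
            value = value * 100 + d
        value = 100 ** len(digits) - value
        for i in reversed(range(len(digits))):
            digits[i] = value % 100
            value //= 100
    frac = 0                                   # base-100 fraction 0.d1 d2 ...
    for d in digits:
        frac = frac * 100 + d
    shift = len(digits) - exp                  # value = frac * 100**(exp - len)
    if shift <= 0:
        return float(sign * frac * 100 ** -shift)
    return sign * frac / 100 ** shift

print(decode_informix_decimal(bytes.fromhex("c101")))      # 1.0
print(decode_informix_decimal(bytes.fromhex("3e63")))      # -1.0
print(decode_informix_decimal(bytes.fromhex("c1095a00")))  # 9.9
```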

SSIS Flat File Export - Only need certain rows from file

I am using SQL Server 2012 and am trying to import a flat file and store it into the database. The problem I am having is that I only need certain rows in a file that contains much more data.
Here is an example from part of a file which I am trying to import.
12/02/2015 09:47:44:917 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 Status: 0x00
TranId: 6, Data ByteCount: 55
Data: 86 A6 4E 0B 6A 64 54 2E 00 50 00 02 00 00 60 1A E0 AD 10 12 BF 07 56 54 20 31 32 42 46 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 CB
Rx Cmd=84, Rsp code=0x00, Device Status=0x50
Sub-Device Index=2
IO Card=0
Channel=0
Manufacturer ID=24602
The only values I care about are ones which are delimited by = (Sub-Device Index, IO Card, etc.). In this example, how can I import the file in a way that the value 2 gets inserted into a column for Sub-Device Index, 0 inserted into a column for IO Card, 0 for Channel, etc.?
You can do this with a script task that reads and parses the file, and either populates SSIS variables for use with an Execute SQL task, or the script task can insert them directly into the database.
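Sketched in Python rather than the C#/VB you would actually write inside an SSIS script task (the function name is mine), the parsing step could look like this:

```python
def parse_device_fields(text: str) -> dict:
    """Collect 'key=value' pairs such as 'Sub-Device Index=2' from a log."""
    fields = {}
    for line in text.splitlines():
        for part in line.split(","):        # one line may hold several pairs
            key, sep, value = part.partition("=")
            if sep and " " not in value.strip():
                fields[key.strip()] = value.strip()
    return fields

sample = """Rx Cmd=84, Rsp code=0x00, Device Status=0x50
Sub-Device Index=2
IO Card=0
Channel=0
Manufacturer ID=24602"""
print(parse_device_fields(sample)["Sub-Device Index"])   # 2
```

Each key then maps onto a column in the target table, whether you insert directly from the script or via SSIS variables and an Execute SQL task.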

Mysql search for exact number in field

In my MySQL DB, I have a column containing this kind of value:
63 61 57 52 50 47 46 44 43 34 33 27 23 22 21 10 5 3 2 1
The numbers are separated by tabs.
I can't get the right result with a simple query like this:
SELECT * FROM mytable WHERE mycolumn = 63
I'm not sure whether "=" is the right method; I've also tried LIKE, IN, and even FIND_IN_SET.
I need some help :)

Automatically break comments after a certain amount of characters

Is it possible to highlight a large comment block and automatically insert breaks after a certain length?
Simple Example
# This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long.
After
# This is a super long message that has too much information in it.
# Although inline comments are cool, this sentence should not be this
# long.
Yes! And it's simpler than the length of this answer might suggest!
Background
There's a command called wrap_lines which has the capability to do exactly what you want. wrap_lines is defined in Packages/Default/paragraph.py, line 112, and is sparsely documented at the Unofficial Docs' list of commands:
wrap_lines
Wraps lines. By default, it wraps lines at the first ruler’s column.
width [Int]: Specifies the column at which lines should be wrapped.
wrap_lines is already accessible via the items located in the Edit -> Wrap menu. There are options to wrap the paragraph in which the cursor lies at column 70, 78, 80, 100, and 120, as well as the first ruler. By default, Wrap Paragraph at Ruler is mapped to Alt+Q on Windows and Super+Alt+Q on OSX.
Working with Rulers
What do I mean by the first ruler? You might summon a ruler via View -> Ruler, but if you want more than one on-screen (or would rather have your rulers in writing), you can add a JSON array of integers—each of which defines a ruler—to any .sublime-settings file. The array I've added to my user preferences file, for example, looks like this:
"rulers":
[
79,
80,
72
],
Thanks to that first ruler rule, Alt+Q will wrap a line that's longer than 79 characters at the 79-column mark, even though there's a ruler "before" it at column 72. "First ruler" doesn't mean the leftmost ruler, but the first defined ruler. If I moved 80 to index 0, as I have below, then the lines would wrap at 80 columns instead. Likewise for 72.
"rulers":
[
80,
79,
72
],
Using a Key Binding
Rulers are for the weak, you say? You can also write a new key binding to wrap a line at a column of your choice! Just add something like this to your Preferences -> Key Bindings – User file:
{ "keys": ["alt+q"], "command": "wrap_lines", "args": {"width": 80} },
Removing the args object would instead wrap at the first ruler, like the Wrap Paragraph at Ruler command does. Wrap Paragraph at Ruler is actually defined just like that in the default Windows keymap file:
{ "keys": ["alt+q"], "command": "wrap_lines" },
Caveats
One of the best (and, in some cases, worst) things about the wrap_lines command is that it will detect any sequence of non-alphanumeric characters that begins the line and duplicate it when wrapping. It's great for writing comments, since the behavior that your example suggested does indeed happen:
# This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long, because having to scroll to the right to finish reading a comment is really annoying!
Becomes:
# This is a super long message that has too much information in it.
# Although inline comments are cool, this sentence should not be this
# long, because having to scroll to the right to finish reading a
# comment is really annoying!
But if the line happens to start with any other symbols, like the beginning of a quote, Sublime Text doesn't know any better than to wrap those, too. So this:
# "This is a super long message that has too much information in it. Although inline comments are cool, this sentence should not be this long, because having to scroll to the right to finish reading a comment is really annoying!"
Becomes this, which we probably don't want:
# "This is a super long message that has too much information in it.
# "Although inline comments are cool, this sentence should not be this
# "long, because having to scroll to the right to finish reading a
# "comment is really annoying!"
It's a good idea to be careful with what starts your original line. Also, the wrap_lines command targets the entire paragraph that your cursor is touching, not just the current line nor, surprisingly, only the working selection. This means that you can use the command again on a newly-wrapped series of lines to re-wrap them at a different column, but you might also end up wrapping some lines that you didn't want to—like if you're aligning a paragraph under a header in Markdown:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the `wrap_lines` command **targets the entire paragraph**, not just the current line.
If the command is activated anywhere within that block of text, you'll get this:
# Hey, this header isn't really there! Be careful with what starts the
original line. Also, the `wrap_lines` command **targets the entire
paragraph**, not just the current line.
You can avoid this sort of issue with clever use of whitespace; another empty line between the header and the paragraph itself will fix the wrapping, so:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the `wrap_lines` command **targets the entire paragraph**, not just the current line.
Becomes:
# Hey, this header isn't really there!
Be careful with what starts the original line. Also, the
`wrap_lines` command **targets the entire paragraph**, not just
the current line.
Since the operation is so fast, you shouldn't have much trouble with tracking down the causes of any errors you encounter, starting over, and creatively avoiding them. Sublime Text usually doesn't have any trouble with mixing together comments and non-comments in code, so if you're lucky, you won't ever have to worry about it!
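Outside Sublime, the same comment-aware behavior can be approximated with Python's textwrap module. This is only a sketch (wrap_lines itself is more general about which prefixes it duplicates):

```python
import textwrap

def wrap_comment(comment: str, width: int = 70, prefix: str = "# ") -> str:
    """Re-wrap a long one-line comment, repeating the prefix on each line."""
    body = comment[len(prefix):] if comment.startswith(prefix) else comment
    return textwrap.fill(body, width=width,
                         initial_indent=prefix, subsequent_indent=prefix)

long_comment = ("# This is a super long message that has too much information "
                "in it. Although inline comments are cool, this sentence "
                "should not be this long.")
print(wrap_comment(long_comment))
```

Every output line starts with the prefix and stays within the given width, just like wrapping at a ruler.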
In Sublime Text 3 (and any earlier version with regular-expression support, i.e. all of them):
Here is the search/replace that inserts a newline every 48 characters:
Find:
(.{48}){1}
Replace:
\1\n
Explanation:
The parentheses form a group so that the replacement can reference matches with \1.
The . matches any character, and {n} matches exactly n of them.
The replace takes each match group and back-substitutes it with a newline \n appended.
Note: \1 technically refers to only the first match found, but using "replace all" handles the remaining regex matches as well.
Real-world example:
Suppose you are formatting public keys. Copied directly from a browser without the header/footer, the key for the Google Internet Authority looks like this:
9c 2a 04 77 5c d8 50 91 3a 06 a3 82 e0 d8 50 48 bc 89 3f f1 19 70 1a 88 46 7e e0 8f c5 f1 89 ce 21 ee 5a fe 61 0d b7 32 44 89 a0 74 0b 53 4f 55 a4 ce 82 62 95 ee eb 59 5f c6 e1 05 80 12 c4 5e 94 3f bc 5b 48 38 f4 53 f7 24 e6 fb 91 e9 15 c4 cf f4 53 0d f4 4a fc 9f 54 de 7d be a0 6b 6f 87 c0 d0 50 1f 28 30 03 40 da 08 73 51 6c 7f ff 3a 3c a7 37 06 8e bd 4b 11 04 eb 7d 24 de e6 f9 fc 31 71 fb 94 d5 60 f3 2e 4a af 42 d2 cb ea c4 6a 1a b2 cc 53 dd 15 4b 8b 1f c8 19 61 1f cd 9d a8 3e 63 2b 84 35 69 65 84 c8 19 c5 46 22 f8 53 95 be e3 80 4a 10 c6 2a ec ba 97 20 11 c7 39 99 10 04 a0 f0 61 7a 95 25 8c 4e 52 75 e2 b6 ed 08 ca 14 fc ce 22 6a b3 4e cf 46 03 97 97 03 7e c0 b1 de 7b af 45 33 cf ba 3e 71 b7 de f4 25 25 c2 0d 35 89 9d 9d fb 0e 11 79 89 1e 37 c5 af 8e 72 69
After search and replace (all), you get:
9c 2a 04 77 5c d8 50 91 3a 06 a3 82 e0 d8 50 48
bc 89 3f f1 19 70 1a 88 46 7e e0 8f c5 f1 89 ce
21 ee 5a fe 61 0d b7 32 44 89 a0 74 0b 53 4f 55
a4 ce 82 62 95 ee eb 59 5f c6 e1 05 80 12 c4 5e
94 3f bc 5b 48 38 f4 53 f7 24 e6 fb 91 e9 15 c4
cf f4 53 0d f4 4a fc 9f 54 de 7d be a0 6b 6f 87
c0 d0 50 1f 28 30 03 40 da 08 73 51 6c 7f ff 3a
3c a7 37 06 8e bd 4b 11 04 eb 7d 24 de e6 f9 fc
31 71 fb 94 d5 60 f3 2e 4a af 42 d2 cb ea c4 6a
1a b2 cc 53 dd 15 4b 8b 1f c8 19 61 1f cd 9d a8
3e 63 2b 84 35 69 65 84 c8 19 c5 46 22 f8 53 95
be e3 80 4a 10 c6 2a ec ba 97 20 11 c7 39 99 10
04 a0 f0 61 7a 95 25 8c 4e 52 75 e2 b6 ed 08 ca
14 fc ce 22 6a b3 4e cf 46 03 97 97 03 7e c0 b1
de 7b af 45 33 cf ba 3e 71 b7 de f4 25 25 c2 0d
35 89 9d 9d fb 0e 11 79 89 1e 37 c5 af 8e 72 69
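The same search/replace can be reproduced with a couple of lines of Python's re module, for instance (the function name is mine):

```python
import re

def insert_breaks(text: str, n: int = 48) -> str:
    """Insert a newline after every n characters, like Find '(.{48}){1}'
    with Replace '\\1\\n'."""
    return re.sub(r"(.{%d})" % n, r"\1\n", text)

print(insert_breaks("x" * 100))
```

Because . does not match newlines by default, any shorter remainder at the end of the text is left untouched, matching the "replace all" behavior above.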
I came across this solution and tried it with Sublime Text 3; it works great. If I use the usual Alt+Q on a Python docstring, it does pretty much what's desirable:
it limits the scope to the docstring
it keeps the beginning and ending ''' or """ properly placed
It involved modifying one command from the default package. Please see here.
https://gist.github.com/SmileyChris/4340807
In Sublime Text 3, the default package is under \Packages\Default.sublime-package. You will have to unzip it and find the file paragraph.py. Place (just) this file under your user package directory, e.g. \Data\Packages\Default\, so that it overrides the default package's paragraph.py.
Thanks to the original author Chris Beaven (SmileyChris).

unknown data encoding

While working with an old application whose existing database is in MS Access, I came across some strangely encoded data, such as 48001700030E0F465075465A56525E1100121D04121B565A58 as an email address.
What kind of data encoding is this? I tried Base64, but it doesn't seem to be that. Can anybody with previous experience with MS Access tell me what encoding this could be?
edit:
more samples
54001700030E0F46507546474550481C1D09090D04461B565A195E5F
40001700030E0F4650755F564E545F06025D100E0C
38001700030E0F4650754545564654155C101C0C
46001700030E0F4650755D565150591D1B0007124F565A58
above samples are surely emails and for web url it looks like this
440505045D070D54585C5B50585D581C1701004F025A58
440505045D121147544C5B584D4B5D17015D100E4F5C5B
This is a VB + MS Access program, if that is any help, and I think it is some standard encoding.
edit (2) ::
From looking at the web URL encoding, it seems 0505045D could stand for http://
edit(3) ::
One combination found:
52021301161209755354595A5E5F561D170B030E1341461B56585A == paresh@falmingoexports.com
It appears to be bytes encoded as hexadecimal. But what those bytes mean, I don't know. Decoding it to ASCII doesn't reveal much:
H \x00\x17\x00\x03\x0e\x0fFPu FZVR^ \x11\x00\x12\x1d\x04\x12\x1bVZX
T \x00\x17\x00\x03\x0e\x0fFPu FGEPH \x1c\x1d\t\t\r\x04F\x1bVZ\x19^_
@ \x00\x17\x00\x03\x0e\x0fFPu _VNT_ \x06\x02]\x10\x0e\x0c
8 \x00\x17\x00\x03\x0e\x0fFPu EEVFT \x15\\\x10\x1c\x0c
F \x00\x17\x00\x03\x0e\x0fFPu ]VQPY \x1d\x1b\x00\x07\x12OVZX
Things I've noticed that may help crack the code:
The 2nd to 10th bytes appear to be constant: \x00\x17\x00\x03\x0e\x0fFPu.
The first byte is the BCD length (spotted by Daniel Brückner!).
The bytes after the constant part appear to be some binary format that either encodes the data or is perhaps a pointer to it.
Two of them end in: \x12?VZX.
The strings seem to be hexadecimal representations of some binary data.
The first two digits are the length of the string (decimal, not hexadecimal), so the string is not entirely hexadecimal.
38 001700030E0F465075 4545 5646 5415 5C10 1C0C
40 001700030E0F465075 5F56 4E54 5F06 025D 100E 0C
46 001700030E0F465075 5D56 5150 591D 1B00 0712 4F56 5A58
48 001700030E0F465075 465A 5652 5E11 0012 1D04 121B 565A 58
54 001700030E0F465075 4647 4550 481C 1D09 090D 0446 1B56 5A19 5E5F
^  ^
|  |
|  +---- constant part, 9 bytes, maybe mailto: or the same domain name of
|        reversed email addresses (com.example@foo)
|
+---- length of the rest in decimal, not hexadecimal
I can see no clear indication for the location of the at-sign and the dot before the top-level domain. Seems to be an indication against simple mono-alphabetic substitutions like ROT13.
paresh@falmingoexports.com
Length: 26 characters
Histogram:
1x: h @ f l i n g x t . c
2x: p a m r e s
3x: o
ASCII values in hexadecimal representation
70 61 72 65 73 68 40 66 61 6C
6D 69 6E 67 6F 65 78 70 6F 72
74 73 2E 63 6F 6D
The length of 52 hexadecimal symbols matches the length of the encoded string.
52 02 13 01 16 12 09 75 53 54
59 5A 5E 5F 56 1D 17 0B 03 0E
13 41 46 1B 56 58 5A
Histogram:
1x: 01 02 03 09 0B 0E 12 16 17 1B 1D 41 46 53 54 58 59 5E 5F 75
2x: 13 56 5A
The histograms don't match - so this rules out mono-alphabetic substitutions possibly followed by a permutation of the string.
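For what it's worth, the histogram comparison can be checked mechanically. Here is a sketch in Python; the ciphertext below is the sample pair from above with its leading length byte (52) stripped:

```python
from collections import Counter

plain = "paresh@falmingoexports.com"
cipher = bytes.fromhex("021301161209755354595A5E5F561D170B030E1341461B56585A")

def histogram_shape(seq):
    """Multiset of symbol frequencies, e.g. [(1, 11), (2, 6), (3, 1)]."""
    return sorted(Counter(Counter(seq).values()).items())

print(histogram_shape(plain))    # [(1, 11), (2, 6), (3, 1)]
print(histogram_shape(cipher))   # [(1, 20), (2, 3)]
```

A mono-alphabetic substitution would preserve the frequency multiset exactly, so the differing shapes rule it out regardless of which symbol maps to which.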