How to search for substring preceded by a known substring - mysql

Using SQL, I want to search for and retrieve substrings that are preceded by a known substring "XX." and end in a " " or ",".
For example, if I start with CONTOOOH 788 XX. 3C, MNOP I need to extract the value 3C.
I've tried substring(input, position, len), but I'm not sure what to pass for len, since the length varies.
select substring(input, position("XX." in input)+4, ???)
from tables
where input like '%XX.%';
I am using MySQL.
Input
CONTOH LALALA 12 XX. 1 ABCD LALA NANA MAMA KAKA
CONTOH NANANANA 34 XX. 02 EFGH IJKL MN
CONTOOOH MAMAMA XX. 1A IJKL YOYO
CONTOOOH NANA XIXI 788 XX. 423C, MNOP QRSTU ASDF POIU
EXAMPLE BLA BLA HOHOHO 910 XX. A4, QRST ASDGHH
EXAMPLE ZZZ AAA BBB 1112 XX. BB5, UVWXASDGHH
Output
1
02
1A
423C
A4
BB5

One option uses SUBSTRING_INDEX with REPLACE:
SELECT
Input,
REPLACE(SUBSTRING_INDEX(
SUBSTRING_INDEX(Input, 'XX. ', -1), ' ', 1), ',', '') AS Output
FROM yourTable;
Here is how the string operations work, step by step:
CONTOOOH 788 XX. 3C, MNOP - initial input
3C, MNOP - after first call to SUBSTRING_INDEX
3C, - after second call to SUBSTRING_INDEX
3C - after call to REPLACE, to remove the comma
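The same chain of operations can be traced outside the database. The following is a minimal Python sketch, assuming a hypothetical helper substring_index that imitates SUBSTRING_INDEX for a non-empty delimiter; the helper and extract_code are illustrative names only, not part of the SQL answer.

```python
def substring_index(s, delim, count):
    """Rough stand-in for SUBSTRING_INDEX (hypothetical helper).

    Assumes delim is non-empty. A positive count keeps everything to the
    left of the count-th delimiter; a negative count keeps everything to
    the right of the count-th delimiter counted from the end.
    """
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


def extract_code(text):
    tail = substring_index(text, "XX. ", -1)  # everything after the last "XX. "
    token = substring_index(tail, " ", 1)     # first space-delimited word
    return token.replace(",", "")             # drop the trailing comma, if any


inputs = [
    "CONTOH LALALA 12 XX. 1 ABCD LALA NANA MAMA KAKA",
    "CONTOH NANANANA 34 XX. 02 EFGH IJKL MN",
    "CONTOOOH MAMAMA XX. 1A IJKL YOYO",
    "CONTOOOH NANA XIXI 788 XX. 423C, MNOP QRSTU ASDF POIU",
    "EXAMPLE BLA BLA HOHOHO 910 XX. A4, QRST ASDGHH",
    "EXAMPLE ZZZ AAA BBB 1112 XX. BB5, UVWXASDGHH",
]
for line in inputs:
    print(extract_code(line))
```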

According to the documentation for MySQL substring
length Optional. The forms without a len argument return a substring from string str starting at position pos.
https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_substring
Thus, if your current code works to your liking except for the length, simply omit it, and SUBSTRING will return everything from that position to the end of the string.


Append information in the th tags to td rows

I am an economist struggling with coding and data scraping.
I am scraping data from the main (and only) table on this webpage (https://www.oddsportal.com/basketball/europe/euroleague-2013-2014/results/). I can retrieve all the information in the td HTML tags with Python Selenium by referring to the class of the element. The same goes for the th tags, which store the date and the stage of the competition. In my final dataset, I would like the information stored in the th tags to appear as two columns (date and stage of the competition) next to the other data in each row. Basically, for each match, I would like to have the date and the stage of the competition in the match's own row rather than as the header of each group of matches.
The only solution I came up with is to index all the rows (with both th and td tags) and build a while loop that appends the information in the th tags to the td rows whose index is lower than the index of the next th tag. I hope I made myself clear (if not, I will try to give a more graphical explanation). However, I am not able to code such a logic construct due to my poor coding abilities. I do not know if I need two loops to iterate through the different tags (td and th) and, if so, how to do that. If you have an easier solution, it is more than welcome!
Thanks in advance for the precious help!
code below:
from selenium import webdriver
import time
import pandas as pd
# Season to filter
seasons_filt = ['2013-2014', '2014-2015', '2015-2016','2016-2017', '2017-2018', '2018-2019']
# Define empty data
data_keys = ["Season", "Match_Time", "Home_Team", "Away_Team", "Home_Odd", "Away_Odd", "Home_Score",
             "Away_Score", "OT", "N_Bookmakers"]
data = dict()
for key in data_keys:
    data[key] = list()
del data_keys
# Define 'driver' variable and launch browser
#path = "C:/Users/ALESSANDRO/Downloads/chromedriver_win32/chromedriver.exe"
#path office pc
path = "C:/Users/aldi/Downloads/chromedriver.exe"
driver = webdriver.Chrome(path)
# Loop through pages based on page_num and season
for season_filt in seasons_filt:
    page_num = 0
    while True:
        page_num += 1
        # Get url and navigate it
        page_str = (1 - len(str(page_num)))* '0' + str(page_num)
        url ="https://www.oddsportal.com/basketball/europe/euroleague-" + str(season_filt) + "/results/#/page/" + page_str + "/"
        driver.get(url)
        time.sleep(3)
        # Check if page has no data
        if driver.find_elements_by_id("emptyMsg"):
            print("Season {} ended at page {}".format(season_filt, page_num))
            break
        try:
            # Teams
            for el in driver.find_elements_by_class_name('name.table-participant'):
                el = el.text.strip().split(" - ")
                data["Home_Team"].append(el[0])
                data["Away_Team"].append(el[1])
                data["Season"].append(season_filt)
            # Scores
            for el in driver.find_elements_by_class_name('center.bold.table-odds.table-score'):
                el = el.text.split(":")
                if el[1][-3:] == " OT":
                    data["OT"].append(True)
                    el[1] = el[1][:-3]
                else:
                    data["OT"].append(False)
                data["Home_Score"].append(el[0])
                data["Away_Score"].append(el[1])
            # Match times
            for el in driver.find_elements_by_class_name("table-time"):
                data["Match_Time"].append(el.text)
            # Odds
            i = 0
            for el in driver.find_elements_by_class_name("odds-nowrp"):
                i += 1
                if i%2 == 0:
                    data["Away_Odd"].append(el.text)
                else:
                    data["Home_Odd"].append(el.text)
            # N_Bookmakers
            for el in driver.find_elements_by_class_name("center.info-value"):
                data["N_Bookmakers"].append(el.text)
            # TODO think of inserting the dates list in the dataframe even if it has a different size (19 rows and not 50)
        except:
            pass
driver.quit()
data = pd.DataFrame(data)
data.to_csv("data_odds.csv", index = False)
I would like to add this information to my dataset as two additional columns:
for el in driver.find_elements_by_class_name("first2.tl")[1:]:
el = el.text.strip().split(" - ")
data["date"].append(el[0])
data["stage"].append(el[1])
A few things I would change here:
Don't overwrite variables. You store elements in your el variable, then you overwrite the element with your strings. It may work for you here, but that practice can get you into trouble later on, especially since you are iterating through those elements. It also makes debugging harder.
I know Selenium has ways to parse the HTML, but I personally find BeautifulSoup a tad easier and more intuitive if you are simply trying to pull data out of the HTML. So I went with BeautifulSoup's .find_previous() to get the th tag that precedes each game, which gives you the date and stage content.
Lastly, I like to construct a list of dictionaries to make up the data frame. Each item in the list is a dictionary of key:value pairs where the key is the column name and the value is the data. You do the opposite and create a dictionary of lists. There is nothing wrong with that, but if the lists don't have the same length, you'll get an error when trying to create the dataframe. With my way, if for whatever reason a value is missing, it will still create the dataframe and will just have a null or NaN for the missing data.
There may be more work you need to do with the code to go through the pages, but this gets you the data in the form you need.
Code:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time
import pandas as pd
from bs4 import BeautifulSoup
import re
# Season to filter
seasons_filt = ['2013-2014', '2014-2015', '2015-2016','2016-2017', '2017-2018', '2018-2019']
# Define 'driver' variable and launch browser
path = "C:/Users/ALESSANDRO/Downloads/chromedriver_win32/chromedriver.exe"
driver = webdriver.Chrome(path)
rows = []
# Loop through pages based on page_num and season
for season_filt in seasons_filt:
    page_num = 0
    while True:
        page_num += 1
        # Get url and navigate it
        page_str = (1 - len(str(page_num)))* '0' + str(page_num)
        url ="https://www.oddsportal.com/basketball/europe/euroleague-" + str(season_filt) + "/results/#/page/" + page_str + "/"
        driver.get(url)
        time.sleep(3)
        # Check if page has no data
        if driver.find_elements_by_id("emptyMsg"):
            print("Season {} ended at page {}".format(season_filt, page_num))
            break
        try:
            soup = BeautifulSoup(driver.page_source, 'html.parser')
            table = soup.find('table', {'id':'tournamentTable'})
            trs = table.find_all('tr', {'class':re.compile('.*deactivate.*')})
            for each in trs:
                teams = each.find('td', {'class':'name table-participant'}).text.split(' - ')
                scores = each.find('td', {'class':re.compile('.*table-score.*')}).text.split(':')
                ot = False
                for score in scores:
                    if 'OT' in score:
                        # assignment, not comparison, so the flag is actually set
                        ot = True
                scores = [x.replace('\xa0OT','') for x in scores]
                matchTime = each.find('td', {'class':re.compile('.*table-time.*')}).text
                # Odds
                i = 0
                for each_odd in each.find_all('td',{'class':"odds-nowrp"}):
                    i += 1
                    if i%2 == 0:
                        away_odd = each_odd.text
                    else:
                        home_odd = each_odd.text
                n_bookmakers = soup.find('td',{'class':'center info-value'}).text
                # The nearest preceding th holds "date - stage" for this game
                date_stage = each.find_previous('th', {'class':'first2 tl'}).text.split(' - ')
                date = date_stage[0]
                stage = date_stage[1]
                row = {'Season':season_filt,
                       'Home_Team':teams[0],
                       'Away_Team':teams[1],
                       'Home_Score':scores[0],
                       'Away_Score':scores[1],
                       'OT':ot,
                       'Match_Time':matchTime,
                       'Home_Odd':home_odd,
                       'Away_Odd':away_odd,
                       'N_Bookmakers':n_bookmakers,
                       'Date':date,
                       'Stage':stage}
                rows.append(row)
        except:
            pass
driver.quit()
data = pd.DataFrame(rows)
data.to_csv("data_odds.csv", index = False)
Output:
print(data.head(15).to_string())
Season Home_Team Away_Team Home_Score Away_Score OT Match_Time Home_Odd Away_Odd N_Bookmakers Date Stage
0 2013-2014 Real Madrid Maccabi Tel Aviv 86 98 False 18:00 -667 +493 7 18 May 2014 Final Four
1 2013-2014 Barcelona CSKA Moscow 93 78 False 15:00 -135 +112 7 18 May 2014 Final Four
2 2013-2014 Barcelona Real Madrid 62 100 False 19:00 +134 -161 7 16 May 2014 Final Four
3 2013-2014 CSKA Moscow Maccabi Tel Aviv 67 68 False 16:00 -278 +224 7 16 May 2014 Final Four
4 2013-2014 Real Madrid Olympiacos 83 69 False 18:45 -500 +374 7 25 Apr 2014 Play Offs
5 2013-2014 CSKA Moscow Panathinaikos 74 44 False 16:00 -370 +295 7 25 Apr 2014 Play Offs
6 2013-2014 Olympiacos Real Madrid 71 62 False 18:45 +127 -152 7 23 Apr 2014 Play Offs
7 2013-2014 Maccabi Tel Aviv Olimpia Milano 86 66 False 17:45 -217 +179 7 23 Apr 2014 Play Offs
8 2013-2014 Panathinaikos CSKA Moscow 73 72 False 16:30 -106 -112 7 23 Apr 2014 Play Offs
9 2013-2014 Panathinaikos CSKA Moscow 65 59 False 18:45 -125 +104 7 21 Apr 2014 Play Offs
10 2013-2014 Maccabi Tel Aviv Olimpia Milano 75 63 False 18:15 -189 +156 7 21 Apr 2014 Play Offs
11 2013-2014 Olympiacos Real Madrid 78 76 False 17:00 +104 -125 7 21 Apr 2014 Play Offs
12 2013-2014 Galatasaray Barcelona 75 78 False 17:00 +264 -333 7 20 Apr 2014 Play Offs
13 2013-2014 Olimpia Milano Maccabi Tel Aviv 91 77 False 18:45 -286 +227 7 18 Apr 2014 Play Offs
14 2013-2014 CSKA Moscow Panathinaikos 77 51 False 16:15 -303 +247 7 18 Apr 2014 Play Offs
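The core idea of attaching the nearest preceding header to every match row can also be sketched without any HTML library. This is a minimal illustration, assuming the rows have already been reduced to (tag, text) pairs in document order; attach_headers and the sample pairs are hypothetical and only reuse values from the output above.

```python
def attach_headers(rows):
    """Carry the most recent 'th' text (date - stage) forward onto each 'td' row."""
    out = []
    date = stage = None
    for tag, text in rows:
        if tag == "th":
            # A header row: remember its date and stage for the rows that follow
            date, stage = text.split(" - ", 1)
        else:
            home, away = text.split(" - ", 1)
            out.append({"Date": date, "Stage": stage,
                        "Home_Team": home, "Away_Team": away})
    return out


sample = [
    ("th", "18 May 2014 - Final Four"),
    ("td", "Real Madrid - Maccabi Tel Aviv"),
    ("td", "Barcelona - CSKA Moscow"),
    ("th", "25 Apr 2014 - Play Offs"),
    ("td", "Real Madrid - Olympiacos"),
]
for match in attach_headers(sample):
    print(match)
```

Because each match becomes its own dictionary, the resulting list can be handed to a data frame constructor in the same list-of-dictionaries style as the answer's code.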

MySQL: SQL to remove country code from phone number

In a MySQL table [customers], I have a phone column [phone] that has phone numbers in this format:
[phone] = [+ sign][country code][space][number]
The [number] part can be in any format:
1234567890
123.456.7890
123 456 7890
(123) 456 7890
etc.
But the full phone number column value [phone] always starts with [+ sign][country code][space]
Example:
[phone] values are:
+1 888-888-8888
+1 888.888 (8888)
+31 104232385
+33 143375100
+31 10 423 2385
+33 1 43 37 51 00
I want to remove the country code: [+ sign][country code][space] from all phone numbers [phone]
Example:
+1 888-888-8888 = 888-888-8888
+1 888.888 (8888) = 888.888 (8888)
+31 104232385 = 104232385
+33 143375100 = 143375100
+31 10 423 2385 = 10 423 2385
+33 1 43 37 51 00 = 1 43 37 51 00
What I want is to remove all leading characters up to and including the first space in the column, which will always be [+ sign][country code][space].
In other words: how do I remove everything before the first occurrence of a certain character (a space) in MySQL?
What would be the query for it?
One way of doing this would be to do a regex replacement:
UPDATE customers
SET phone = REGEXP_REPLACE(phone, '^\\+[0-9]+ ', '');
If you're using a version of MySQL earlier than 8.0, we can use the base string functions as a workaround:
UPDATE customers
SET phone = SUBSTR(phone, INSTR(phone, ' ') + 1)
WHERE phone LIKE '+%';
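To check the two approaches against the sample values before running an UPDATE, the same logic can be mirrored in a few lines of Python. This is only a sketch: the function names are made up for illustration, and Python's re module stands in for the database's regex engine.

```python
import re


def strip_with_regex(phone):
    # Same pattern as the REGEXP_REPLACE call: "+", digits, one space, anchored at the start
    return re.sub(r"^\+[0-9]+ ", "", phone)


def strip_with_split(phone):
    # Mirrors SUBSTR(phone, INSTR(phone, ' ') + 1) guarded by LIKE '+%'
    if not phone.startswith("+"):
        return phone
    _, sep, rest = phone.partition(" ")
    return rest if sep else phone


samples = [
    "+1 888-888-8888",
    "+1 888.888 (8888)",
    "+31 104232385",
    "+33 143375100",
    "+31 10 423 2385",
    "+33 1 43 37 51 00",
]
for value in samples:
    print(value, "=", strip_with_regex(value))
```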

SQL: select between delimiters with delimiters itself, insert in other column and delete string

Instead of having one column for each group of values, I made one column named "data" and used HTML like this:
<dt>Phone:</dt><dd>0 23 16/3 82 73 42 23</dd>
<dt>Phone:</dt><dd>0 21 61/81 26 73 13 22</dd>
<dt>Fax:</dt><dd>03 27/3 87 42 37 32</dd>
<dt>Website:</dt><dd>www.example.com</dd>
I have since realized that this wasn't very clever, so I made a column for each value. My new columns are named "phone", "phone2", "fax" and "website".
I need SQL code that, for example, selects everything between the delimiters <dt>Phone:</dt><dd> and </dd> including the delimiters themselves, inserts this string into the column "phone", and deletes it from the "data" column.
But I need to select the first string <dt>Phone:</dt><dd>0 23 16/3 82 73 42 23</dd> not the second <dt>Phone:</dt><dd>0 21 61/81 26 73 13 22</dd>.
Can anybody give me a hint how to do that?
For selecting data between <dt>Phone:</dt><dd> and </dd> you can use SUBSTRING_INDEX.
Like this
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(data, '</dd>', 1), '<dt>Phone:</dt><dd>', -1) as phone,
SUBSTRING_INDEX(SUBSTRING_INDEX(data, '<dt>Phone:</dt><dd>', -1), '</dd>', 1) as phone2,
SUBSTRING_INDEX(SUBSTRING_INDEX(data, '<dt>Fax:</dt><dd>', -1), '</dd>', 1) as fax,
SUBSTRING_INDEX(SUBSTRING_INDEX(data, '<dt>Website:</dt><dd>', -1), '</dd>', 1) as website
from data_col;
Update:
<dt>Phone:</dt> is not always at the top.
If there is no fixed order of the entries in the "data" column, try this one:
SELECT
IF (temp.f_phone > 0, SUBSTR(data, temp.f_phone + LENGTH('<dt>Phone:</dt><dd>'), f_phone_end - temp.f_phone - LENGTH('<dt>Phone:</dt><dd>')), null) as PHONE_1,
IF (temp.s_phone > 0, SUBSTR(data, temp.s_phone + LENGTH('<dt>Phone:</dt><dd>'), s_phone_end - temp.s_phone - LENGTH('<dt>Phone:</dt><dd>')), null) as PHONE_2
from data_col dc
JOIN (
SELECT id, @f_phone:= LOCATE('<dt>Phone:</dt><dd>', data) as f_phone,
LOCATE('</dd>', data, @f_phone+1) f_phone_end,
@s_phone := LOCATE('<dt>Phone:</dt><dd>', data, @f_phone+1) as s_phone,
LOCATE('</dd>', data, @s_phone+1) as s_phone_end
from data_col) temp ON temp.id = dc.id;
First find the starting position of each possible element (e.g. "phone", "phone2") and the position of its closing tag </dd>. Then use SUBSTR from the starting position of the element + the length of the delimiter, with length = end_position - start_position - delimiter_length.
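The position arithmetic described above can be followed step by step in Python, where str.find plays the role of LOCATE (0-based instead of 1-based). This is a minimal sketch; value_after is a hypothetical helper and the data string is the example from the question joined into one value.

```python
PHONE = "<dt>Phone:</dt><dd>"


def value_after(data, open_tag, close_tag="</dd>", start=0):
    """Return (text between open_tag and the next close_tag, position of open_tag)."""
    pos = data.find(open_tag, start)
    if pos == -1:
        return None, -1
    begin = pos + len(open_tag)          # start position + delimiter length
    end = data.find(close_tag, begin)    # position of the closing tag
    if end == -1:
        return None, -1
    return data[begin:end], pos


data = (
    "<dt>Phone:</dt><dd>0 23 16/3 82 73 42 23</dd>"
    "<dt>Phone:</dt><dd>0 21 61/81 26 73 13 22</dd>"
    "<dt>Fax:</dt><dd>03 27/3 87 42 37 32</dd>"
    "<dt>Website:</dt><dd>www.example.com</dd>"
)

phone_1, first_pos = value_after(data, PHONE)
# Search for the second phone entry starting just past the first one
phone_2, _ = value_after(data, PHONE, start=first_pos + 1) if first_pos != -1 else (None, -1)
print(phone_1)
print(phone_2)
```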

Using MySQL SELECT WHERE IN with multibyte characters

I have a table of all defined Unicode characters (the character column) and their associated Unicode points (the id column). I have the following query:
SELECT id FROM unicode WHERE `character` IN ('A', 'B', 'C')
While this query should return only 3 rows (id = 65, 66, 67), it instead returns 129 rows including the following IDs:
65 66 67 97 98 99 129 141 143 144 157 160 193 205 207 208 221 224 257
269 271 272 285 288 321 333 335 336 349 352 449 461 463 464 477 480
2049 2061 2063 2064 2077 2080 4161 4173 4175 4176 4189 4192 4929 4941
4943 4944 4957 4960 5057 5069 5071 5072 5085 5088 5121 5133 5135 5136
5149 5152 5953 5965 5967 5968 5984 6145 6157 6160 6176 8257 8269 8271
8272 8285 8288 9025 9037 9039 9040 9053 9056 9153 9165 9167 9168 9181
9184 9217 9229 9231 9232 9245 9248 10049 10061 10063 10064 10077 10080
10241 10253 10255 10256 10269 10272 12353 12365 12367 12368 12381
12384 13121 13133 13135 13136 13149 13152 13249 13261 13263 13264
13277 13280
I'm sure this must have something to do with multi-byte characters but I'm not sure how to fix it. Any ideas what's going on here?
String equality and order is governed by a collation. By default the collation used is determined from the column, but you can set the collation per-query with the COLLATE clause. For example, if your columns are declared with charset utf8 you could use utf8_bin to use a binary collation that considers A and à different:
SELECT id FROM unicode WHERE `character` COLLATE utf8_bin IN ('A', 'B', 'C')
Alternatively you could use the BINARY operator to convert character into a "binary string" which forces the use of a binary comparison, which is almost but not quite the same as binary collation:
SELECT id FROM unicode WHERE BINARY `character` IN ('A', 'B', 'C')
Update: I thought that the following should be equivalent, but it's not because a column has lower "coercibility" than the constants. The binary string constants would be converted into non-binary and then compared.
SELECT id FROM unicode WHERE `character` IN (_binary'A', _binary'B', _binary'C')
You can try:
SELECT id FROM unicode WHERE `character` IN (_utf8'A',_utf8'B',_utf8'C')

Code Golf: Playing Tetris

The basics:
Consider the following tetrominoes and empty playing field:
0123456789
I O Z T L S J [ ]
[ ]
# ## ## ### # ## # [ ]
# ## ## # # ## # [ ]
# ## ## [ ]
# [ ]
[==========]
The dimensions of the playing field are fixed. The numbers at the top are just here
to indicate the column number (also see input).
Input:
1. You are given a specific playing field (based on the above) which can already be filled partly
with tetrominoes (this can be in a separate file or provided via stdin).
Sample input:
[ ]
[ ]
[ ]
[ ]
[ # # #]
[ ## ######]
[==========]
2. You are given a string which describes (separated by spaces) which tetromino to insert (and
drop down) at which column. Tetrominoes don't need to be rotated. Input can be read from stdin.
Sample input:
T2 Z6 I0 T7
You can assume input is 'well-formed' (or produce undefined behaviour when it's not).
Output
Render the resulting field ('full' lines must disappear) and print the score count
(every dropped line accounts for 10 points).
Sample output based on the sample input above:
[ ]
[ ]
[ ]
[# ###]
[# ### ]
[##### ####]
[==========]
10
Winner:
Shortest solution (by code character count). Usage examples are nice. Have fun golfing!
Edit: added a bounty of +500 reputation to draw some more attention to the nice efforts the answerers already made (and possibly some new solutions to this question)...
GolfScript - 181 characters
Newlines are not necessary. Output is in standard output, although some errors are present in stderr.
\10 should be replaced by the corresponding ASCII character for the program to be 181 characters.
{):X!-{2B{" #"=}%X" ":f*+-1%}%:P;:>.{\!:F;>P{\(#{3&\(#.2$&F|:F;|}%\+}%\+F![f]P+:P
;}do;{"= "&},.,7^.R+:R;[>0="#"/f*]*\+}0"R#1(XBc_""~\10"{base}:B/3/~4*"nIOZTLSJR "
";:"*~;n%)n*~ 10R*+n*
Sample I/O:
$ cat inp
[ ]
[ ]
[ ]
[ ]
[ # # #]
[ ## ######]
[==========]
T2 Z6 I0 T7
$ cat inp|golfscript tetris.gs 2>/dev/null
[ ]
[ ]
[ ]
[# ###]
[# ### ]
[##### ####]
[==========]
10
Tetromino compression:
Pieces are stored as three base 8 digits. This is a simple binary representation, e.g. T=[7,2,0], S=[6,3,0], J=[2,2,3]. [1] is used for the I piece in compression, but this is explicitly set to [1,1,1,1] later (i.e. the 4* in the code). All of these arrays are concatenated into a single array, which is converted into an integer, and then a string (base 126 to minimize non-printable characters and length, and to avoid multi-byte UTF-8 characters). This string is very short: "R#1(XBc_".
Decompression is then straightforward. We first do a base 126 conversion followed by a base 8 conversion ("~\10"{base}/, i.e. iterate through "~\10" and do a base conversion for each element). The resulting array is split into groups of 3, the array for I is fixed (3/~4*). We then convert each element to base 2 and (after removing zeros) replace each binary digit with the character of that index in the string " #" (2base{" #"=}%...-1% - note that we need to reverse the array otherwise 2 would become "# " instead of " #").
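The last decompression step (base 2 conversion, mapping digits onto " #", then reversing) can be illustrated with a short Python sketch. It covers only that final step, not the base 126 unpacking, and decode_piece is a hypothetical name used for illustration.

```python
def decode_piece(rows):
    """Turn row values such as [7, 2, 0] into strings of ' ' and '#'."""
    out = []
    for value in rows:
        if value == 0:
            continue                      # zero rows are dropped
        bits = bin(value)[2:]             # most significant bit first, e.g. 2 -> "10"
        text = "".join(" #"[int(b)] for b in bits)
        out.append(text[::-1])            # reverse so that 2 becomes " #", not "# "
    return out


for name, rows in [("T", [7, 2, 0]), ("S", [6, 3, 0]), ("J", [2, 2, 3])]:
    print(name)
    print("\n".join(decode_piece(rows)))
```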
Board/piece format, dropping pieces
The board is simply an array of strings, one for each line. No work is initially done on this, so we can generate it with n/( on the input. Pieces are also arrays of strings, padded with spaces to the left for their X position, but without trailing spaces. Pieces are dropped by prepending to the array, and continuously testing whether there is a collision.
Collision testing is done by iterating through all characters in the piece, and comparing against the character of the same position on the board. We want to regard #+= and #+# as collisions, so we test whether ((piecechar&3)&boardchar) is nonzero. While doing this iteration, we also update (a copy of) the board with ((piecechar&3)|boardchar), which correctly sets the value for the pairs #+space, space+#, and space+[. We use this updated board if there is a collision after moving the piece down another row.
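The bit trick works because of the ASCII codes involved: '#' is 35 (low two bits 11), ' ' is 32 (low two bits 00), '=' is 61 and '[' is 91. A small Python sketch of the test and the merge follows; overlay is a hypothetical helper, and the piece row is assumed to be left-padded with spaces so that its indices line up with the board row, bracket included.

```python
def overlay(piece_row, board_row):
    """Return (collided, merged_row) for one piece row laid over one board row."""
    collided = False
    merged = []
    for p, b in zip(piece_row, board_row):
        mask = ord(p) & 3                 # 3 for '#', 0 for ' '
        if mask & ord(b):                 # nonzero against '#' or '='
            collided = True
        merged.append(chr(mask | ord(b))) # ' '|'#' -> '#', '#'|' ' -> '#', ' '|'[' -> '['
    merged.extend(board_row[len(piece_row):])  # rest of the board row is unchanged
    return collided, "".join(merged)


print(overlay("  ##", "[   #      ]"))
print(overlay("    #", "[   #      ]"))
print(overlay(" #", "[==========]"))
```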
Removing filled rows is quite simple. We remove all rows for which "= "& returns false. A filled row will have neither = nor a space, so the conjunction will be a blank string, which equates to false. Then we count the number of rows that have been removed, add the count to the score and prepend that many "[ ... ]"s. We generate this compactly by taking the first row of the grid and replacing # with a space.
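The row-clearing rule reads naturally as a filter. Here is a minimal Python sketch of the same idea, assuming the board is a list of strings with brackets included; clear_rows is a hypothetical name, and the score is returned already multiplied by 10.

```python
def clear_rows(board):
    """Drop rows containing neither ' ' nor '=', then pad with blank rows on top."""
    kept = [row for row in board if " " in row or "=" in row]
    cleared = len(board) - len(kept)
    blank = board[0].replace("#", " ")    # an empty row built from the top row
    return [blank] * cleared + kept, 10 * cleared


board = [
    "[          ]",
    "[#      ###]",
    "[##########]",
    "[##### ####]",
    "[==========]",
]
new_board, points = clear_rows(board)
print("\n".join(new_board))
print(points)
```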
Bonus
Since we compute what the board would look like in each position of the piece as it falls, we can keep these on the stack instead of deleting them! For a total of three characters more, we can output all these positions (or two characters if we have the board states single spaced).
{):X!-{2B{" #"=}%X" ":f*+-1%}%:P;:>.{>[f]P+:P(!:F;{\(#{3&\(#.2$&F|:F;|}%\+}%\+F!}
do;{"= "&},.,7^.R+:R;[>0="#"/f*]*\+}0"R#1(XBc_""~\10"{base}:B/3/~4*"nIOZTLSJR "
";:"*~;n%)n*~ ]{n*n.}/10R*
Perl, 586 523 483 472 427 407 404 386 387 356 353 chars
(Needs Perl 5.10 for the defined-or // operator).
Takes all input from stdin. Still needs some serious golfing.
Note that ^Q represents ASCII 17 (DC1/XON), ^C represents ASCII 3 and ^@ represents ASCII 0 (NUL).
while(<>){push@A,[split//]if/]/;while(/\w/g){for$i(0..6){for($f=0,$j=4;$j--;){$c=0;map{if($_){$i--,$f=$j=3,redo if$A[$k=$i+$j][$C=$c+$'+1]ne$";$A[$k][$C]="#"if$f}$c++}split//,unpack"b*",chr vec"3^@'^@c^@^Q^C6^@\"^C^Q^Q",index(OTZLSJI,$&)*4+$j,4;$s+=10,@A[0..$k]=@A[$k,0..$k-1],map{s/#/ /}@{$A[0]},$i++if 9<grep/#/,@{$A[$k]}}last if$f}}}print+(map@$_,@A),$s//0,$/
Commented version:
while(<>){
# store the playfield as an AoA of chars
push@A,[split//]if/]/;
# while we're getting pieces
while(/\w/g){
# for each line of playfield
for$i(0..6){
# for each line of current piece
for($f=0,$j=4;$j--;){
# for each column of current piece
$c=0;
map{
if($_){
# if there's a collision, restart loop over piece lines
# with a mark set and playfield line decremented
$i--,$f=$j=3,redo if$A[$k=$i+$j][$C=$c+$'+1]ne$";
# if we already found a collision, draw piece
$A[$k][$C]="#"if$f
}
$c++
# pieces are stored as a bit vector, 16 bits (4x4) per piece,
# expand into array of 1's and 0's
}split//,unpack"b*",chr vec"3^@'^@c^@^Q^C6^@\"^C^Q^Q",index(OTZLSJI,$&)*4+$j,4;
# if this playfield line is full, remove it. Done by array slicing
# and substituting all "#"'s in line 0 with " "'s
$s+=10,@A[0..$k]=@A[$k,0..$k-1],map{s/#/ /}@{$A[0]},$i++if 9<grep/#/,@{$A[$k]}
}
# if we found a collision, stop iterating over the playfield and get next piece from input
last if$f
}
}
}
# print everything
print+(map@$_,@A),$s//0,$/
Edit 1: some serious golfing, fix output bug.
Edit 2: some inlining, merged two loops into one for a net saving of (drum roll...) 3 chars, misc golfing.
Edit 3: some common subexpression elimination, a little constant merging and tweaked a regex.
Edit 4: changed representation of tetrominoes into a packed bit vector, misc golfing.
Edit 5: more direct translation from tetromino letter to array index, use non-printable characters, misc golfing.
Edit 6: fixed bug cleaning top line, introduced in r3 (edit 2), spotted by Nakilon. Use more non-printable chars.
Edit 7: use vec for getting at tetromino data. Take advantage of the fact that the playfield has fixed dimensions. if statement => if modifier, the merging of loops of edit 2 starts paying off. Use // for the 0-score case.
Edit 8: fixed another bug, introduced in r6 (edit 5), spotted by Nakilon.
Edit 9: don't create new references when clearing lines, just move references around via array slicing. Merge two map's into one. Smarter regex. "Smarter" for. Misc golfings.
Edit 10: inlined tetromino array, added commented version.
Ruby — 427 408 398 369 359
t=[*$<]
o=0
u=->f{f.transpose}
a=u[t.reverse.join.scan /#{'( |#)'*10}/]
t.pop.split.map{|w|m=(g='I4O22Z0121T01201L31S1201J13'[/#{w[0]}\d+/].scan(/0?\d/).zip a.drop w[1].to_i).map{|r,b|(b.rindex ?#or-1)-r.size+1}.max
g.map{|r,b|b.fill ?#,m+r.size,r.to_i}
v=u[a]
v.reject!{|i|i-[?#]==[]&&(o+=10;v)<<[' ']*10}
a=u[v]}
puts u[a].reverse.map{|i|?[+i*''+?]},t[-1],o
Bash shell script (301 304 characters)
UPDATE: Fixed a bug involving pieces that extend into the top row. Also, the output is now sent to standard out, and as a bonus, it is possible to run the script again to continue playing a game (in which case you must add up the total score yourself).
This includes nonprintable characters, so I have provided a hex dump. Save it as tetris.txt:
0000000: 7461 696c 202d 3120 245f 7c7a 6361 743e tail -1 $_|zcat>
0000010: 753b 2e20 750a 1f8b 0800 35b0 b34c 0203 u;. u.....5..L..
0000020: 5590 516b 8330 10c7 dff3 296e 4c88 ae64 U.Qk.0....)nL..d
0000030: a863 0c4a f57d 63b0 07f7 b452 88d1 b4da .c.J.}c....R....
0000040: 1a5d 5369 91a6 df7d 899a d05d 5e72 bfbb .]Si...}...]^r..
0000050: fbff 2fe1 45d5 0196 7cff 6cce f272 7c10 ../.E...|.l..r|.
0000060: 387d 477c c4b1 e695 855f 77d0 b29f 99bd 8}G|....._w.....
0000070: 98c6 c8d2 ef99 8eaa b1a5 9f33 6d8c 40ec ...........3m.@.
0000080: 6433 8bc7 eeca b57f a06d 27a1 4765 07e6 d3.......m'.Ge..
0000090: 3240 dd02 3df1 2344 f04a 0d1d c748 0bde 2@..=.#D.J...H..
00000a0: 75b8 ed0f 9eef 7bd7 7e19 dd16 5110 34aa u.....{.~...Q.4.
00000b0: c87b 2060 48a8 993a d7c0 d210 ed24 ff85 .{ `H..:.....$..
00000c0: c405 8834 548a 499e 1fd0 1a68 2f81 1425 ...4T.I....h/..%
00000d0: e047 bc62 ea52 e884 42f2 0f0b 8b37 764c .G.b.R..B....7vL
00000e0: 17f9 544a 5bbd 54cb 9171 6e53 3679 91b3 ..TJ[.T..qnS6y..
00000f0: 2eba c07a 0981 f4a6 d922 89c2 279f 1ab5 ...z....."..'...
0000100: 0656 c028 7177 4183 2040 033f 015e 838b .V.(qwA. @.?.^..
0000110: 0d56 15cf 4b20 6ff3 d384 eaf3 bad1 b9b6 .V..K o.........
0000120: 72be 6cfa 4b2f fb03 45fc cd51 d601 0000 r.l.K/..E..Q....
Then, at the bash command prompt, preferably with elvis rather than vim installed as vi:
$ xxd -r tetris.txt tetris.sh
$ chmod +x tetris.sh
$ cat << EOF > b
> [ ]
> [ ]
> [ ]
> [ ]
> [ # # #]
> [ ## ######]
> [==========]
> EOF
$ ./tetris.sh T2 Z6 I0 T7 2>/dev/null
-- removed stuff that is not in standard out --
[ ]
[ ]
[ ]
[# ###]
[# ### ]
[##### ####]
[==========]
10
How it works
The code self-extracts itself similarly to how executable programs compressed using the gzexe script do. Tetromino pieces are represented as sequences of vi editor commands. Character counting is used to detect collisions, and line counting is used to calculate the score.
The unzipped code:
echo 'rej.j.j.:wq!m'>I
echo '2rejh.:wq!m'>O
echo '2rej.:wq!m'>Z
echo '3rejh1.:wq!m'>T
echo 'rej.j2.:wq!m'>L
echo 'l2rej2h.:wq!m'>S
echo 'lrej.jh2.:wq!m'>J
for t
do for y in `seq 1 5`
do echo -n ${y}jk$((${t:1}+1))l|cat - ${t:0:1}|vi b>0
grep ========== m>0||break
[ `tr -cd '#'<b|wc -c` = `tr -cd '#'<m|wc -c` ]||break
tr e '#'<m>n
done
cat n>b
grep -v '##########' b>m
$((S+=10*(`wc -l < b`-`wc -l < m`)))
yes '[ ]'|head -7|cat - m|tail -7>b
done
cat b
echo $S
The original code before golfing:
#!/bin/bash
mkpieces() {
pieces=('r@j.j.j.' '2r@jh.' '2r@j.' '3r@jh1.' 'r@j.j2.' 'l2r@j2h.' 'lr@j.jh2.')
letters=(I O Z T L S J)
for j in `seq 0 9`; do
for i in `seq 0 6`; do
echo "jk$(($j+1))l${pieces[$i]}:wq! temp" > ${letters[$i]}$j
done
done
}
counthashes() {
tr -cd '#' < $1 | wc -c
}
droppiece() {
for y in `seq 1 5`; do
echo -n $y | cat - $1 | vi board > /dev/null
egrep '={10}' temp > /dev/null || break
[ `counthashes board` -eq `counthashes temp` ] || break
tr @ "#" < temp > newboard
done
cp newboard board
}
removelines() {
egrep -v '#{10}' board > temp
SCORE=$(($SCORE + 10 * (`wc -l < board` - `wc -l < temp`)))
yes '[ ]' | head -7 | cat - temp | tail -7 > board
}
SCORE=0
mkpieces
for piece; do
droppiece $piece
removelines
done
cat board
echo $SCORE
Python: 504 519 chars
(Python 3 solution) Currently requires to set the input in the format as shown at the top (input code is not counted). I'll expand to read from file or stdin later. Now works with a prompt, just paste the input in (8 lines total).
R=range
f,p=[input()[1:11]for i in R(7)],0
for(a,b)in input().split():
t=[' '*int(b)+r+' '*9for r in{'I':'#,#,#,#','O':'##,##','Z':'##, ##','T':'###, # ','L':'#,#,##','S':' ##,##','J':' #, #,##'}[a].split(',')]
for r in R(6-len(t),0,-1):
for i in R(len(t)):
if any(a==b=='#'for(a,b)in zip(t[i],f[r+i])):break
else:
for i in R(0,len(t)):
f[r+i]=''.join(a if b!='#'else b for(a,b)in zip(t[i],f[r+i]))
if f[r+i]=='#'*10:del f[r+i];f[0:0]=[' '*10];p+=10
break
print('\n'.join('['+r+']'for r in f[:7]),p,sep='\n')
Not sure if I can save much more there. Quite a lot characters are lost from the transformation to bitfields, but that saves a lot more characters than working with the strings. Also I'm not sure if I can remove more whitespace there, but I'll try it later.
Won't be able to reduce it much more; after having the bitfield-based solution, I transitioned back to strings, as I found a way to compress it more (saved 8 characters over the bitfield!). But given that I forgot to include the L and had an error with the points inside, my character count only goes up sigh... Maybe I find something later to compress it a bit more, but I think I'm near the end. For the original and commented code see below:
Original version:
field = [ input()[1:11] for i in range(7) ] + [ 0, input() ]
# hardcoded tetrominoes
tetrominoes = {'I':('#','#','#','#'),'O':('##','##'),'Z':('##',' ##'),'T':('###',' # '),'L':('#','#','##'),'S':(' ##','##'),'J':(' #',' #','##')}
for ( f, c ) in field[8].split():
    # shift tetromino to the correct column
    tetromino = [ ' ' * int(c) + r + ' ' * 9 for r in tetrominoes[f] ]
    # find the correct row to insert
    for r in range( 6 - len( tetromino ), 0, -1 ):
        for i in range( len( tetromino ) ):
            if any( a == b == '#' for (a,b) in zip( tetromino[i], field[r+i] ) ):
                # skip the row if some pieces overlap
                break
        else:
            # didn't break, insert the tetromino
            for i in range( 0, len( tetromino ) ):
                # merge the tetromino with the field
                field[r+i] = ''.join( a if b != '#' else b for (a,b) in zip( tetromino[i], field[r+i] ) )
                # check for completely filled rows
                if field[r+i] == '#' * 10:
                    # remove current row
                    del field[r+i]
                    # add new row
                    field[0:0] = [' '*10]
                    field[7] += 10
            # we found the row, so abort here
            break
# print it in the requested format
print( '\n'.join( '[' + r + ']' for r in field[:7] ) )
# and add the points = 10 * the number of redundant lines at the end
print( str( field[7] ) )
Ruby 1.9, 357 355 353 339 330 310 309 chars
d=0
e=[*$<]
e.pop.split.map{|f|f="L\003\003\007J\005\005\007O\007\007Z\007\013S\013\007I\003\003\003\003T\017\005"[/#{f[j=0]}(\W*)/,1].bytes.map{|z|?\0+?\0*f[1].hex+z.to_s(2).tr("01"," #")[1,9]}
k,f,i=i,[p]+f,e.zip(f).map{|l,m|l.bytes.zip(m.to_s.bytes).map{|n,o|j|=n&3&q=o||0;(n|q).chr}*""}until j>0
e=[]
e+=k.reject{|r|r.sum==544&&e<<r.tr(?#,?\s)&&d+=10}}
puts e,d
Note that the \000 escapes (including the null bytes on the third line) should be replaced with their actual nonprintable equivalent.
Sample input:
[ ]
[ ]
[ ]
[ ]
[ # # #]
[ ## ######]
[==========]
T2 Z6 I0 T7
Usage:
ruby1.9 tetris.rb < input
or
ruby1.9 tetris.rb input
C, 727 [...] 596 581 556 517 496 471 461 457 chars
This is my first code golf. I think the character count can get much lower; it would be nice if experienced golfers could give me some hints.
The current version can handle playfields with different dimensions, too. The input can have linebreaks in both DOS/Windows and Unix format.
The code was pretty straightforward before optimization, the tetrominoes are stored in 4 integers that are interpreted as an (7*3)x4 bit array, the playfield is stored as-is, tiles are dropped and complete lines are removed at start and after each tile drop.
I wasn't sure how to count characters, so I used the filesize of the code with all unneccessary linebreaks removed.
EDIT 596=>581: Thanks to KitsuneYMG, everything except the %ls suggestion worked perfectly, additionally, I noticed putch instead of putchar can be used (getch somehow doesn't work) and removed all the parentheses in #define G.
EDIT 581=>556: Wasn't satisfied with the remaining for and the nested F loops, so there was some merging, changing and removing of loops, quite confusing but definitely worth it.
EDIT 556=>517: Finally found a way to make a an int array. Some N; merged with c, no break anymore.
EDIT 496=>471: Playfield width and height fixed now.
EDIT 471=>461: Minor modifications; putchar is used again because putch is not a standard function.
EDIT: Bugfix. Complete lines were removed before the tile drop instead of after, so complete lines could be left at the end. The fix doesn't change the character count.
#define N (c=getchar())
#define G T[j%4]&1<<t*3+j/4
#define X j%4*w+x+j/4
#define F(x,m) for(x=0;x<m;x++)
#define W while
T[]={916561,992849,217,1},C[99],c,i,j,s,t,x,A,a[99],w=13;
main(){F(j,7)C["IJLSTZO"[j]]=j;
F(j,91)a[j]=N;
W(N>w){t=C[c];x=N-86;
W(c){F(j,12)if(G&&X>1?a[X]-32:0)c=0;
F(j,12)if(G&&X>w&&!c)a[X-w]=35;x+=w;}N;
F(i,6){A=0;t=i*w;F(x,w)A|=(a[t+x]==32);
if(!A){s++;F(j,t)a[t+w-j]=a[t-j];
x=1;W(a[x]-93)a[x++]=32;}}}
F(i,91)putchar(a[i]);printf("%i0",s);}
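To make the (7*3)x4 bit array easier to picture, here is a small Python sketch that decodes the four integers in T[] the same way the G macro tests them. The names shape and ORDER are labels made up for this illustration and are not part of the golfed program; the piece order is taken from the "IJLSTZO" string in main().

```python
# The four integers from the C source: one integer per row of the piece grid.
T = [916561, 992849, 217, 1]
# Piece order used when main() fills the lookup table C[].
ORDER = "IJLSTZO"


def shape(piece):
    """Return the piece as 4 strings of 3 cells each, top row first.

    Mirrors the macro  G = T[j%4] & 1 << t*3 + j/4,
    where j%4 is the row and j/4 is the column.
    """
    t = ORDER.index(piece)
    return [
        "".join("#" if T[row] >> (t * 3 + col) & 1 else " " for col in range(3))
        for row in range(4)
    ]


if __name__ == "__main__":
    for name in ORDER:
        print(name)
        for line in shape(name):
            print("  [" + line + "]")
```

Each piece takes three bits (its three columns) in every one of the four row integers, so seven pieces need 21 bits per integer.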
Python 2.6+ - 334 322 316 characters
397 368 366 characters uncompressed
#coding:l1
exec'xÚEPMO!½ï¯ i,P*Ýlš%ì­‰=‰Ö–*†­þz©‰:‡—Lò¾fÜ”bžAù,MVi™.ÐlǃwÁ„eQL&•uÏÔ‹¿1O6ǘ.€LSLÓ’¼›î”3òšL¸tŠv[ѵl»h;ÁºŽñÝ0Àë»Ç‡ÛûH.ª€¼âBNjr}¹„V5¾3Dë#¼¡•gO. ¾ô6 çÊsÃЮürÃ1&›ßVˆ­ùZ`Ü€ÿžcx±ˆ‹sCàŽ êüRô{U¯ZÕDüE+³ŽFA÷{CjùYö„÷¦¯Î[0þøõ…(Îd®_›â»E#–Y%’›”ëýÒ·X‹d¼.ß9‡kD'.decode('zip')
The single newline is required, and I've counted it as one character.
Browser code-page mumbo jumbo might prevent a successful copy-and-paste of this code, so you can optionally generate the file from this code:
s = """
23 63 6F 64 69 6E 67 3A 6C 31 0A 65 78 65 63 27 78 DA 45 50 4D 4F 03 21
10 BD EF AF 20 69 2C 50 2A 02 DD 6C 9A 25 EC AD 07 8D 89 07 3D 89 1C D6
96 2A 86 05 02 1B AD FE 7A A9 89 3A 87 97 4C F2 BE 66 DC 94 62 9E 41 F9
2C 4D 56 15 69 99 0F 2E D0 6C C7 83 77 C1 16 84 65 51 4C 26 95 75 CF 8D
1C 15 D4 8B BF 31 4F 01 36 C7 98 81 07 2E 80 4C 53 4C 08 D3 92 BC 9B 11
EE 1B 10 94 0B 33 F2 9A 1B 4C B8 74 8A 9D 76 5B D1 B5 6C BB 13 9D 68 3B
C1 BA 8E F1 DD 30 C0 EB BB C7 87 DB FB 1B 48 8F 2E 1C AA 80 19 BC E2 42
4E 6A 72 01 7D B9 84 56 35 BE 33 44 8F 06 EB 40 BC A1 95 67 4F 08 2E 20
BE F4 36 A0 E7 CA 73 C3 D0 AE FC 72 C3 31 26 9B DF 56 88 AD F9 5A 60 DC
80 FF 9E 63 78 B1 88 8B 73 43 E0 8E A0 EA FC 52 F4 7B 55 8D AF 5A 19 D5
44 FC 45 2B B3 8E 46 9D 41 F7 7B 43 6A 12 F9 59 F6 84 F7 A6 01 1F AF CE
5B 30 FE F8 F5 85 28 CE 64 AE 5F 9B E2 BB 45 23 96 59 25 92 9B 94 EB FD
10 D2 B7 58 8B 64 BC 2E DF 39 87 6B 44 27 2E 64 65 63 6F 64 65 28 27 7A
69 70 27 29
"""
with open('golftris.py', 'wb') as f:
    f.write(''.join(chr(int(i, 16)) for i in s.split()))
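For anyone curious how the compress-then-exec trick works, here is a minimal sketch. It assumes Python 3, where strings have no decode('zip') method, so it calls zlib directly; the tiny source string is only a stand-in for the real program. The payload in the hex dump above starts with 78 DA, which is the usual zlib header for the highest compression level.

```python
import zlib

# Stand-in for the real golfed program (hypothetical example source).
source = "result = sum(range(10))\n"

# Pack the source at the highest compression level.
packed = zlib.compress(source.encode("latin-1"), 9)

# Unpack and run it, which is what exec'...'.decode('zip') does in Python 2.
namespace = {}
exec(zlib.decompress(packed).decode("latin-1"), namespace)

print(len(source), len(packed), namespace["result"])
```

For a source this short the packed form is not actually smaller; the trick only pays off once the program is long and repetitive enough.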
Testing
intetris
[          ]
[          ]
[          ]
[          ]
[ #    #  #]
[ ## ######]
[==========]
T2 Z6 I0 T7
Newlines must be Unix-style (linefeed only). A trailing newline on the last line is optional.
To test:
> python golftris.py < intetris
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10
This code unzips the original code, and executes it with exec. This decompressed code weighs in at 366 characters and looks like this:
import sys
r=sys.stdin.readlines();s=0;p=r[:1];a='[##########]\n'
for l in r.pop().split():
 n=int(l[1])+1;i=0xE826408E26246206601E>>'IOZTLSJ'.find(l[0])*12;m=min(zip(*r[:6]+[a])[n+l].index('#')-len(bin(i>>4*l&31))+3for l in(0,1,2))
 for l in range(12):
  if i>>l&2:c=n+l/4;o=m+l%4;r[o]=r[o][:c]+'#'+r[o][c+1:]
 while a in r:s+=10;r.remove(a);r=p+r
print''.join(r),s
Newlines are required, and are one character each.
Don't try to read this code. The variable names are literally chosen at random in search of the highest compression (with different variable names, I saw as much as 342 characters after compression). A more understandable version follows:
import sys
board = sys.stdin.readlines()
score = 0
blank = board[:1] # notice that I rely on the first line being blank
full = '[##########]\n'
for piece in board.pop().split():
    column = int(piece[1]) + 1 # "+ 1" to skip the '[' at the start of the line
    # explanation of these three lines after the code
    bits = 0xE826408E26246206601E >> 'IOZTLSJ'.find(piece[0]) * 12
    drop = min(zip(*board[:6]+[full])[column + x].index('#') -
               len(bin(bits >> 4 * x & 31)) + 3 for x in (0, 1, 2))
    for i in range(12):
        if bits >> i & 2: # if the current cell should be a '#'
            x = column + i / 4
            y = drop + i % 4
            board[y] = board[y][:x] + '#' + board[y][x + 1:]
    while full in board: # if there is a full line,
        score += 10 # score it,
        board.remove(full) # remove it,
        board = blank + board # and replace it with a blank line at top
print ''.join(board), score
The crux is in the three cryptic lines I said I'd explain.
The shape of the tetrominoes is encoded in the hexadecimal number there. Each tetromino is considered to occupy a 3x4 grid of cells, where each cell is either blank (a space) or full (a number sign). Each piece is then encoded with 3 hexadecimal digits, each digit describing one 4-cell column. The least significant digits describe the left-most columns, and the least significant bit in each digit describes the top-most cell in each column. If a bit is 0, then that cell is blank, otherwise it's a '#'. For example, the I tetromino is encoded as 00F, with the four bits of the least-significant digit set on to encode the four number signs in the left-most column, and the T is 131, with the top bit set on the left and the right, and the top two bits set in the middle.
The entire hexadecimal number is then shifted one bit to the left (multiplied by two). This will allow us to ignore the bottom-most bit. I'll explain why in a minute.
So given the current piece from the input, we find the index into this hexadecimal number where the 12 bits describing its shape begin, then shift that down so that bits 1–12 (skipping bit 0) of the bits variable describe the current piece.
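The encoding described above can be checked with a short sketch. It is written for Python 3 (so it uses // for integer division), and the helper names cells and render are made up for this illustration; only the constant and the 'IOZTLSJ' order come from the answer's code.

```python
SHAPES = 0xE826408E26246206601E   # constant from the answer (already shifted left by one bit)
ORDER = "IOZTLSJ"                 # piece order from the answer


def cells(name):
    """Return the occupied (column, row) cells of a piece inside its 3x4 grid."""
    bits = SHAPES >> ORDER.find(name) * 12
    # "& 2" looks at bit i+1, which skips the extra bit 0 added by the left shift.
    return [(i // 4, i % 4) for i in range(12) if bits >> i & 2]


def render(name):
    """Draw the piece as 4 strings of 3 cells, top row first."""
    occupied = set(cells(name))
    return [
        "".join("#" if (col, row) in occupied else " " for col in range(3))
        for row in range(4)
    ]


if __name__ == "__main__":
    for name in ORDER:
        print(name, render(name))
```

Running it shows, for instance, that T comes out as a row of three cells with one cell under the middle, which matches the 131 encoding.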
The assignment to drop determines how many rows from the top of the grid the piece will fall before landing on other piece fragments. The first line finds how many empty cells there are at the top of each column of the playing field, while the second finds the lowest occupied cell in each column of the piece. The zip function returns a list of tuples, where each tuple consists of the nth cell from each item in the input list. So, using the sample input board, zip(*board[:6] + [full]) will return:
[
('[', '[', '[', '[', '[', '[', '['),
(' ', ' ', ' ', ' ', ' ', ' ', '#'),
(' ', ' ', ' ', ' ', '#', '#', '#'),
(' ', ' ', ' ', ' ', ' ', '#', '#'),
(' ', ' ', ' ', ' ', ' ', ' ', '#'),
(' ', ' ', ' ', ' ', ' ', '#', '#'),
(' ', ' ', ' ', ' ', ' ', '#', '#'),
(' ', ' ', ' ', ' ', '#', '#', '#'),
(' ', ' ', ' ', ' ', ' ', '#', '#'),
(' ', ' ', ' ', ' ', ' ', '#', '#'),
(' ', ' ', ' ', ' ', '#', '#', '#'),
(']', ']', ']', ']', ']', ']', ']')
]
We select the tuple from this list corresponding to the appropriate column, and find the index of the first '#' in the column. This is why we appended a "full" row before calling zip, so that index will have a sensible return (instead of throwing an exception) when the column is otherwise blank.
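Here is the transpose-plus-sentinel idea on its own, as a Python 3 sketch. The rows are written without their trailing newlines for brevity, and the name heights is just a label for this example.

```python
# Sample field from the question (six rows, no trailing newlines).
board = [
    "[          ]",
    "[          ]",
    "[          ]",
    "[          ]",
    "[ #    #  #]",
    "[ ## ######]",
]
full = "[##########]"   # sentinel row, so every column contains a '#'

# zip(*rows) transposes the rows into columns; list() is needed in Python 3.
columns = list(zip(*board + [full]))

# Number of empty cells at the top of each of the ten playfield columns
# (string indexes 1..10, skipping the '[' and ']' columns).
heights = [col.index("#") for col in columns[1:11]]
print(heights)   # [6, 4, 5, 6, 5, 5, 4, 5, 5, 4]
```

Without the sentinel row, columns 0 and 3 would have no '#' at all and index would raise a ValueError.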
Then to find the lowest '#' in each column of the piece, we shift and mask the four bits that describe that column, then use the bin function to turn that into a string of ones and zeros. The bin function only returns significant bits, so we need only calculate the length of this string to find the lowest occupied cell (most significant set bit). The bin function also prepends '0b', so we have to subtract that. We also ignore the least significant bit. This is why the hexadecimal number is shifted one bit to the left. This is to account for empty columns, whose string representations would have the same length as a column with only the top cell full (such as the T piece).
For example, the columns of the I tetromino, as mentioned earlier, are F, 0, and 0. bin(0xF) is '0b1111'. After ignoring the '0b', we have a length of 4, which is correct. But bin(0x0) is '0b0'. After ignoring the '0b', we still have a length of 1, which is incorrect. To account for this, we've added an additional bit to the end, so that we can ignore this insignificant bit. Hence, the +3 in the code is there to account for the extra length taken up by the '0b' at the beginning, and the insignificant bit at the end.
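A quick way to convince yourself of the +3 is to compare the two length calculations side by side. This is only an illustrative sketch; the function names are invented here and do not appear in the golfed code.

```python
def depth_naive(column_bits):
    """Length of the binary string without the '0b' prefix (wrong for empty columns)."""
    return len(bin(column_bits)) - 2


def depth_shifted(column_bits):
    """Same idea with one extra low bit, as the answer does.

    Subtracting 3 drops the '0b' prefix and the extra bit.
    """
    return len(bin(column_bits << 1 & 31)) - 3


if __name__ == "__main__":
    for bits in (0x0, 0x1, 0x3, 0xF):
        print(hex(bits), depth_naive(bits), depth_shifted(bits))
```

The naive version reports 1 for both an empty column and a column with only its top cell filled, while the shifted version reports 0 and 1.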
All of this occurs within a generator expression for three columns ((0,1,2)), and we take the min result to find the maximum number of rows the piece can drop before it touches in any of the three columns.
The rest should be pretty easy to understand by reading the code, but the for loop following these assignments adds the piece to the board. After this, the while loop removes full rows, replacing them with blank rows at the top, and tallies the score. At the end, the board and score are printed to the output.
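The row-clearing step can also be sketched in isolation. This is a Python 3 illustration with rows stored without trailing newlines; the function name clear_full_rows and its width parameter are assumptions for the example, not part of the answer's code.

```python
def clear_full_rows(board, score=0, width=10):
    """Remove every full row, add a blank row on top for each, and add 10 points per row."""
    full = "[" + "#" * width + "]"
    blank = "[" + " " * width + "]"
    board = list(board)          # work on a copy
    while full in board:
        score += 10
        board.remove(full)       # removes the first full row found
        board.insert(0, blank)   # keeps the board height constant
    return board, score


if __name__ == "__main__":
    rows = [
        "[          ]",
        "[#         ]",
        "[##########]",
        "[##### ####]",
    ]
    cleared, points = clear_full_rows(rows)
    print("\n".join(cleared))
    print(points)
```

Because a blank row is inserted for every row removed, the rows above the cleared line move down by one and the floor stays where it is.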
Python, 298 chars
Beats all non-esoteric language solutions so far (Perl, Ruby, C, bash...)
... and does not even use code-zipping chicanery.
import os
r=os.read
b='[%11c\n'%']'*99+r(0,91)
for k,v in r(0,99).split():
 t=map(ord,' -:G!.:; -:; !-.!"-. !". !./')['IJLOSTZ'.find(k)*4:][:4];v=int(v)-31
 while'!'>max(b[v+j+13]for j in t):v+=13
 for j in t:b=b[:v+j]+'#'+b[v+j+1:]
 b=b.replace('[##########]\n','')
print b[-91:],1060-10*len(b)/13
On the test example
[          ]
[          ]
[          ]
[          ]
[ #    #  #]
[ ## ######]
[==========]
T2 Z6 I0 T7
it outputs
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10
PS. fixed a bug pointed out by Nakilon at cost of +5
Golfscript 260 chars
I'm sure this could be improved, I'm kind of new to Golfscript.
[39 26.2/0:$14{.(}:?~1?15?1?14 2??27?13.!14?2?27?14 1]4/:t;n/)\n*:|;' '/-1%.,:c;~{)18+:&;'XIOZTLSJX'\%~;,1-t\={{.&+.90>{;.}*|\=32=!{&13-:&;}*}%}6*{&+}/|{\.#<'#'+\)|>+}4*{'['\10*']'++}:
;n/0\~n+:|;0\{.'#'
={;)}{n+|+:|;}if\.}do;' '
n+\.#*|+\$+:$;.,1-<:|;}c*|n?$*
End of lines are relevant (there shouldn't be one at the end). Anyway, here are some of the test cases I used:
> cat init.txt
[          ]
[          ]
[          ]
[          ]
[ #    #  #]
[ ## ######]
[==========]
T2 Z6 I0 T7> cat init.txt | ruby golfscript.rb tetris.gsc
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10
> cat init.txt
[          ]
[          ]
[          ]
[          ]
[ #    #  #]
[ ## ##### ]
[==========]
I0 O7 Z1 S4> cat init.txt | ruby golfscript.rb tetris.gsc
[          ]
[          ]
[          ]
[#         ]
[###  #### ]
[### ##### ]
[==========]
10
> cat init.txt
[          ]
[          ]
[          ]
[ ## ###   ]
[ #    #   ]
[ ## ######]
[==========]
T7 I0 I3> cat init.txt | ruby golfscript.rb tetris.gsc
[          ]
[          ]
[          ]
[          ]
[#  #      ]
[## #  # # ]
[==========]
20
Note that there is no end of line in the input file; an end of line would break the script as is.
O'Caml 809 782 Chars
open String let w=length let c s=let x=ref 0in iter(fun k->if k='#'then incr x)s;!x open List let(#),g,s,p,q=nth,ref[],ref 0,(0,1),(0,2)let l=length let u=Printf.printf let rec o x i j=let a=map(fun s->copy s)!g in if snd(fold_left(fun(r,k)(p,l)->let z=c(a#r)in blit(make l '#')0(a#r)(i+p)l;if c(a#r)=z+l then r+1,k else r,false)(j-l x+1,true)x)then g:=a else o x i(j-1)and f x=let s=read_line()in if s.[1]='='then g:=rev x else f(sub s 1 10::x)let z=f [];read_line();;for i=0to w z/3 do o(assoc z.[i*3]['I',[p;p;p;p];'O',[q;q];'Z',[q;1,2];'T',[0,3;1,1];'L',[p;p;q];'S',[1,2;q];'J',[1,1;1,1;q]])(Char.code z.[i*3+1]-48)(l!g-1);let h=l!g in g:=filter(fun s->c s<>w s)!g;for i=1to h-(l!g)do incr s;g:=make 10' '::!g done;done;iter(fun r->u"[%s]\n"r)!g;u"[==========]\n";u"%d\n"(!s*10)
Common Lisp 667 657 645 Chars
My first attempt at code golf, so there are probably many tricks that I don't know yet. I left some newlines there to keep some residual "readability" (I counted newlines as 2 bytes, so removing 6 unnecessary newlines gains 12 more characters).
In the input, put the shapes first, then the field.
(let(b(s 0)m(e'(0 1 2 3 4 5 6 7 8 9)))
(labels((o(p i)(mapcar(lambda(j)(+ i j))p))(w(p r)(o p(* 13 r)))(f(i)(find i b))
(a(&aux(i(position(read-char)"IOZTLSJ")))(when i(push(o(nth i'((0 13 26 39)(0 1 13 14)(0 1 14 15)(0 1 2 14)(0 13 26 27)(1 2 13 14)(1 14 26 27)))(read))m)(a))))
(a)(dotimes(i 90)(if(find(read-char)"#=")(push i b)))(dolist(p(reverse m))
(setf b`(,@b,@(w p(1-(position-if(lambda(i)(some #'f(w p i)))e)))))
(dotimes(i 6)(when(every #'f(w e i))(setf s(1+ s)b(mapcar(lambda(k)(+(if(>(* 13 i)k)13(if(<=(* 13(1+ i))k)0 78))k))b)))))
(dotimes(i 6)(format t"[~{~:[ ~;#~]~}]
"(mapcar #'f(w e i))))(format t"[==========]
~a0"s)))
Testing
T2 Z6 I0 T7
[          ]
[          ]
[          ]
[          ]
[ #    #  #]
[ ## ######]
[==========]
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10
NIL
Ruby 505 479 474 442 439 426 chars
A first attempt. Have done it with IronRuby. I'm sure it can be improved, but I really should get some work done today!
p,q,r,s=(0..9),(0..2),(0..6),0
t=[*$<]
f=p.map{|a|g=0;r.map{|b|g+=2**b if t[6-b][a+1]==?#};g}
t.pop.split.map{|x|w,y=[15,51,306,562,23,561,113]["IOZTLSJ"=~/#{x[0]}/],x[1].to_i
l=q.map{|d|r.inject{|b,c|f[d+y]&(w>>(d*4)&15-c+1)>0?c:b}}.max
q.map{|b|f[b+y]|=w>>(b*4)&15-l}
r.map{i=f.inject{|a,b|a&b};f.map!{|a|b=i^(i-1);a=((a&~b)>>1)+(a&(b>>1))};s+=i>0?10:0}}
p.map{|a|r.map{|b|t[6-b][a+1]=f[a]&2**b>0??#:' '}}
puts t,s
Testing
cat test.txt | ruby tetris.rb
[          ]
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10
Edit
Now using normal Ruby. The walls are included in the output now.
Another one in Ruby, 573 546 characters:
Z={I:?#*4,J:'#,###',L:'###,#',O:'##,##',S:'#,##, #',Z:' #,##,#',T:' #,##, #'}
t=[*$<]
R=->s{s.reverse}
T=->m{m.transpose}
a = T[R[t].join.scan /.#{'(\D)'*10}.$/]
t.pop.split.each{|z|
t,o=Z[z[0].to_sym].split(',').map{|x|x.split //},z[1].to_i
r=0..t.size-1
y=r.map{|u|1+a[o+u].rindex(?#).to_i-t[u].count(' ')}.max
(0..3).each{|i|r.each{|j|t[j][i]==?#&&a[o+j][y+i]=t[j][i]}}}
s=0
a.each{|x|s=a.max_by(&:size).size;x[s-=1]||=' 'while s>0}
a=R[T[a].reject{|x|x*''=~/[#]{10}/&&s+=10}.map{|x|?[+x*''+?]}[0..6]]
puts (0..8-a.size).map{?[+' '*10+?]},a,s
Testing:
cat test.txt | ruby 3858384_tetris.rb
[          ]
[          ]
[          ]
[          ]
[#      ###]
[#     ### ]
[##### ####]
[==========]
10