I have a problem using VIPER with ARACNe networks.
I used the regulonbrca network from aracne to calculate protein activity.
regulonbrca contains 6054 genes.
Of these, 5918 are also present in my TPM file.
But the viper result contains only 4506 genes.
Why are some genes missing after running viper? Is there a default setting that causes this?
Here is my code:
expdat= "TPM-brca(entrez_avg).csv"
clsdat= "immunesubtype-BRCA.csv"
regulon= regulonbrca
dat0 = data.frame(read.csv(expdat))
dat1 = dat0[,-c(1,2)]
rownames(dat1) = dat0[,1]
cls0 = data.frame(read.csv(clsdat))
colnames(cls0) = c("id", "description")
cls1 = cls0 %>% dplyr::filter(id %in% colnames(dat1))
dat = dat1[,cls1$id]
cls = data.frame(description=cls1[,2])
rownames(cls) = cls1[,1]
meta = data.frame(labelDescription=c("description"), row.names = colnames(cls))
pheno = new("AnnotatedDataFrame", data=cls, varMetadata=meta)
dset = ExpressionSet(assayData = as.matrix(dat), phenoData = pheno)
signature = rowTtest(dset, "description", "TRUE", "FALSE")
signature = (qnorm(signature$p.value/2, lower.tail = FALSE) * sign(signature$statistic))[, 1]
nullmodel = ttestNull(dset, "description", "TRUE", "FALSE", per = 1000, repos = TRUE, verbose = FALSE)
vpres = viper(dset, regulon, verbose = FALSE) ## single sample viper
res_ss0 = data.frame(id=row.names(vpres@assayData$exprs), vpres@assayData$exprs)
geneid_ss0 = ldply(sapply(row.names(res_ss0), converter), data.frame) ## gene mapping
colnames(geneid_ss0) = c('id', 'gene')
res_ss = merge(geneid_ss0, res_ss0, key='id')[, -1]
I have created a spaCy transformer model for named entity recognition. Last time I trained it until it reached 90% accuracy, and I have a model-best directory from which I can load the trained model for predictions. Now I have more data samples and want to resume training this spaCy transformer. I saw that this can be done by changing config.cfg, but I don't know what to change.
This is my config.cfg after running python -m spacy init fill-config ./base_config.cfg ./config.cfg:
[paths]
train = null
dev = null
vectors = null
init_tok2vec = null
[system]
gpu_allocator = "pytorch"
seed = 0
[nlp]
lang = "en"
pipeline = ["transformer","ner"]
batch_size = 128
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
[components]
[components.ner]
factory = "ner"
incorrect_spans_key = null
moves = null
scorer = {"@scorers":"spacy.ner_scorer.v1"}
update_with_oracle_cut_size = 100
[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = false
nO = null
[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"
[components.transformer]
factory = "transformer"
max_batch_items = 4096
set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "roberta-base"
mixed_precision = false
[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96
[components.transformer.model.grad_scaler_config]
[components.transformer.model.tokenizer_config]
use_fast = true
[components.transformer.model.transformer_config]
[corpora]
[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null
[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null
[training]
accumulate_gradient = 3
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = []
annotating_components = []
before_to_disk = null
[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
size = 2000
buffer = 256
get_length = null
[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false
[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001
[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005
[training.score_weights]
ents_f = 1.0
ents_p = 0.0
ents_r = 0.0
ents_per_type = null
[pretraining]
[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null
[initialize.components]
[initialize.tokenizer]
As you can see, there is a 'vectors' parameter under [initialize], so I tried pointing it at the vectors from 'model-best' like this:
But it gave me this error:
OSError: [E884] The pipeline could not be initialized because the vectors could not be found at './model-best/ner'. If your pipeline was already initialized/trained before, call 'resume_training' instead of 'initialize', or initialize only the components that are new.
In case you are wondering whether I gave the wrong path: no, that directory exists.
So please guide me on how I can resume training from the previous weights.
Thank you!
The vectors setting is not related to the transformer or what you're trying to do.
In the new config, you want to use the source option to load the components from the existing pipeline. You would modify the [components] blocks so that each contains only the source setting and no other settings:
[components.ner]
source = "/path/to/model-best"
[components.transformer]
source = "/path/to/model-best"
See: https://spacy.io/usage/training#config-components
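As a rough sketch of the edit described above, the following Python helper prints component blocks that contain only a source setting. The helper name sourced_component_blocks and the /path/to/model-best placeholder are illustrative assumptions, not part of spaCy; the output still has to be pasted into config.cfg by hand in place of the existing component settings.

```python
def sourced_component_blocks(component_names, model_path):
    """Build config text where each component is sourced from a trained pipeline."""
    blocks = []
    for name in component_names:
        # Each block carries only the source setting and nothing else.
        blocks.append(f'[components.{name}]\nsource = "{model_path}"')
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Component names taken from the pipeline list in the question's config.
    print(sourced_component_blocks(["transformer", "ner"], "/path/to/model-best"))
```

Training is then started as before with python -m spacy train and the updated config.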
The vectors setting refers to word vectors here. To reuse the vocabulary from the previously trained spaCy pipeline, you can use the following config:
[components.ner]
source = "/path/to/model-best"
[initialize]
vectors = ${paths.vectors}
[initialize.before_init]
@callbacks = "spacy.copy_from_base_model.v1"
tokenizer = "/path/to/model-best"
vocab = "/path/to/model-best"
I am going mad with doxygen at the moment. I hope this is not just a layer 8 problem, but we will see...
I want to extract static and undocumented items, which is why EXTRACT_ALL = YES is set for now. But EXTRACT_ALL does not seem to work as the documentation intends. See the example below.
Contrary to the documentation of EXTRACT_ALL, there are still warnings in the output.
# If the EXTRACT_ALL tag is set to YES, doxygen will assume all entities in
# documentation are documented, even if no documentation was available. Private
# class members and static file members will be hidden unless the
# EXTRACT_PRIVATE respectively EXTRACT_STATIC tags are set to YES.
# Note: This will also disable the warnings about undocumented members that are
# normally produced when WARNINGS is set to YES.
# The default value is: NO.
From the doxygen output:
Generating docs for file test_main.c...
C:/Users/xxx/src/test/test_main.c:11: warning: Member variable_non_static (variable) of file test_main.c is not documented.
EXTRACT_STATIC = YES is also set, so static variables should be extracted, but they are not.
test_main.c
/**
* @file test_main.c
* @brief
* @version ver. xxx
* @date 2020-08-17
*
* @copyright Copyright (c) 2020
*
*/
int variable_non_static = 1;
static int variable_static = 0;
The results do not show static variables.
Config used:
# Doxyfile 1.8.19
#---------------------------------------------------------------------------
# Project related configuration options
#---------------------------------------------------------------------------
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = XXX
PROJECT_NUMBER =
PROJECT_BRIEF =
PROJECT_LOGO =
OUTPUT_DIRECTORY = C:\Users\xxx
CREATE_SUBDIRS = NO
ALLOW_UNICODE_NAMES = NO
OUTPUT_LANGUAGE = English
OUTPUT_TEXT_DIRECTION = None
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ABBREVIATE_BRIEF = "The $name class" \
"The $name widget" \
"The $name file" \
is \
provides \
specifies \
contains \
represents \
a \
an \
the
ALWAYS_DETAILED_SEC = NO
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = YES
STRIP_FROM_PATH =
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
JAVADOC_BANNER = NO
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 4
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = YES
OPTIMIZE_OUTPUT_JAVA = NO
OPTIMIZE_FOR_FORTRAN = NO
OPTIMIZE_OUTPUT_VHDL = NO
OPTIMIZE_OUTPUT_SLICE = NO
EXTENSION_MAPPING =
MARKDOWN_SUPPORT = YES
TOC_INCLUDE_HEADINGS = 5
AUTOLINK_SUPPORT = YES
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
SIP_SUPPORT = NO
IDL_PROPERTY_SUPPORT = YES
DISTRIBUTE_GROUP_DOC = NO
GROUP_NESTED_COMPOUNDS = NO
SUBGROUPING = YES
INLINE_GROUPED_CLASSES = NO
INLINE_SIMPLE_STRUCTS = NO
TYPEDEF_HIDES_STRUCT = NO
LOOKUP_CACHE_SIZE = 0
NUM_PROC_THREADS = 1
#---------------------------------------------------------------------------
# Build related configuration options
#---------------------------------------------------------------------------
EXTRACT_ALL = YES
EXTRACT_PRIVATE = YES
EXTRACT_PRIV_VIRTUAL = YES
EXTRACT_PACKAGE = YES
EXTRACT_STATIC = YES
EXTRACT_LOCAL_CLASSES = YES
EXTRACT_LOCAL_METHODS = YES
EXTRACT_ANON_NSPACES = YES
HIDE_UNDOC_MEMBERS = NO
HIDE_UNDOC_CLASSES = NO
HIDE_FRIEND_COMPOUNDS = NO
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = NO
HIDE_SCOPE_NAMES = YES
HIDE_COMPOUND_REFERENCE= NO
SHOW_INCLUDE_FILES = YES
SHOW_GROUPED_MEMB_INC = NO
FORCE_LOCAL_INCLUDES = NO
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_MEMBERS_CTORS_1ST = NO
SORT_GROUP_NAMES = NO
SORT_BY_SCOPE_NAME = NO
STRICT_PROTO_MATCHING = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = YES
GENERATE_BUGLIST = YES
GENERATE_DEPRECATEDLIST= YES
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_FILES = YES
SHOW_NAMESPACES = YES
FILE_VERSION_FILTER =
LAYOUT_FILE =
CITE_BIB_FILES =
#---------------------------------------------------------------------------
# Configuration options related to warning and progress messages
#---------------------------------------------------------------------------
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_AS_ERROR = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
#---------------------------------------------------------------------------
# Configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = C:\Users\xxx\src
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.c \
*.cc \
*.cxx \
*.cpp \
*.c++ \
*.java \
*.ii \
*.ixx \
*.ipp \
*.i++ \
*.inl \
*.idl \
*.ddl \
*.odl \
*.h \
*.hh \
*.hxx \
*.hpp \
*.h++ \
*.cs \
*.d \
*.php \
*.php4 \
*.php5 \
*.phtml \
*.inc \
*.m \
*.markdown \
*.md \
*.mm \
*.dox \
*.doc \
*.txt \
*.py \
*.pyw \
*.f90 \
*.f95 \
*.f03 \
*.f08 \
*.f18 \
*.f \
*.for \
*.vhd \
*.vhdl \
*.ucf \
*.qsf \
*.ice
RECURSIVE = NO
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS =
EXAMPLE_PATH =
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE =
#---------------------------------------------------------------------------
# Configuration options related to source browsing
#---------------------------------------------------------------------------
SOURCE_BROWSER = YES
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = NO
REFERENCES_RELATION = NO
REFERENCES_LINK_SOURCE = YES
SOURCE_TOOLTIPS = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
CLANG_ASSISTED_PARSING = NO
CLANG_OPTIONS =
CLANG_DATABASE_PATH =
#---------------------------------------------------------------------------
# Configuration options related to the alphabetical class index
#---------------------------------------------------------------------------
ALPHABETICAL_INDEX = YES
COLS_IN_ALPHA_INDEX = 5
IGNORE_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the HTML output
#---------------------------------------------------------------------------
GENERATE_HTML = YES
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_HEADER =
HTML_FOOTER =
HTML_STYLESHEET =
HTML_EXTRA_STYLESHEET =
HTML_EXTRA_FILES =
HTML_COLORSTYLE_HUE = 220
HTML_COLORSTYLE_SAT = 100
HTML_COLORSTYLE_GAMMA = 80
HTML_TIMESTAMP = NO
HTML_DYNAMIC_MENUS = YES
HTML_DYNAMIC_SECTIONS = NO
HTML_INDEX_NUM_ENTRIES = 100
GENERATE_DOCSET = NO
DOCSET_FEEDNAME = "Doxygen generated docs"
DOCSET_BUNDLE_ID = org.doxygen.Project
DOCSET_PUBLISHER_ID = org.doxygen.Publisher
DOCSET_PUBLISHER_NAME = Publisher
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
CHM_INDEX_ENCODING =
BINARY_TOC = NO
TOC_EXPAND = NO
GENERATE_QHP = NO
QCH_FILE =
QHP_NAMESPACE = org.doxygen.Project
QHP_VIRTUAL_FOLDER = doc
QHP_CUST_FILTER_NAME =
QHP_CUST_FILTER_ATTRS =
QHP_SECT_FILTER_ATTRS =
QHG_LOCATION =
GENERATE_ECLIPSEHELP = NO
ECLIPSE_DOC_ID = org.doxygen.Project
DISABLE_INDEX = NO
GENERATE_TREEVIEW = YES
ENUM_VALUES_PER_LINE = 4
TREEVIEW_WIDTH = 250
EXT_LINKS_IN_WINDOW = NO
HTML_FORMULA_FORMAT = png
FORMULA_FONTSIZE = 10
FORMULA_TRANSPARENT = YES
FORMULA_MACROFILE =
USE_MATHJAX = NO
MATHJAX_FORMAT = HTML-CSS
MATHJAX_RELPATH = https://cdn.jsdelivr.net/npm/mathjax@2
MATHJAX_EXTENSIONS =
MATHJAX_CODEFILE =
SEARCHENGINE = YES
SERVER_BASED_SEARCH = NO
EXTERNAL_SEARCH = NO
SEARCHENGINE_URL =
SEARCHDATA_FILE = searchdata.xml
EXTERNAL_SEARCH_ID =
EXTRA_SEARCH_MAPPINGS =
#---------------------------------------------------------------------------
# Configuration options related to the LaTeX output
#---------------------------------------------------------------------------
GENERATE_LATEX = YES
LATEX_OUTPUT = latex
LATEX_CMD_NAME =
MAKEINDEX_CMD_NAME = makeindex
LATEX_MAKEINDEX_CMD = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4
EXTRA_PACKAGES =
LATEX_HEADER =
LATEX_FOOTER =
LATEX_EXTRA_STYLESHEET =
LATEX_EXTRA_FILES =
PDF_HYPERLINKS = YES
USE_PDFLATEX = YES
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
LATEX_SOURCE_CODE = NO
LATEX_BIB_STYLE = plain
LATEX_TIMESTAMP = NO
LATEX_EMOJI_DIRECTORY =
#---------------------------------------------------------------------------
# Configuration options related to the RTF output
#---------------------------------------------------------------------------
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
RTF_SOURCE_CODE = NO
#---------------------------------------------------------------------------
# Configuration options related to the man page output
#---------------------------------------------------------------------------
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_SUBDIR =
MAN_LINKS = NO
#---------------------------------------------------------------------------
# Configuration options related to the XML output
#---------------------------------------------------------------------------
GENERATE_XML = NO
XML_OUTPUT = xml
XML_PROGRAMLISTING = YES
XML_NS_MEMB_FILE_SCOPE = NO
#---------------------------------------------------------------------------
# Configuration options related to the DOCBOOK output
#---------------------------------------------------------------------------
GENERATE_DOCBOOK = NO
DOCBOOK_OUTPUT = docbook
DOCBOOK_PROGRAMLISTING = NO
#---------------------------------------------------------------------------
# Configuration options for the AutoGen Definitions output
#---------------------------------------------------------------------------
GENERATE_AUTOGEN_DEF = NO
#---------------------------------------------------------------------------
# Configuration options related to Sqlite3 output
#---------------------------------------------------------------------------
#---------------------------------------------------------------------------
# Configuration options related to the Perl module output
#---------------------------------------------------------------------------
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH = "xxx"
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration options related to external references
#---------------------------------------------------------------------------
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
EXTERNAL_PAGES = YES
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------
CLASS_DIAGRAMS = YES
DIA_PATH =
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = YES
DOT_NUM_THREADS = 0
DOT_FONTNAME = Helvetica
DOT_FONTSIZE = 10
DOT_FONTPATH =
CLASS_GRAPH = YES
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = NO
UML_LIMIT_NUM_FIELDS = 10
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = YES
CALLER_GRAPH = YES
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
INTERACTIVE_SVG = NO
DOT_PATH =
DOTFILE_DIRS =
MSCFILE_DIRS =
DIAFILE_DIRS =
PLANTUML_JAR_PATH =
PLANTUML_CFG_FILE =
PLANTUML_INCLUDE_PATH =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES
The OP uses the doxygen wizard 1.8.19, which has a small problem: https://github.com/doxygen/doxygen/issues/7951.
This means that doxygen 1.8.19 cannot be started from the doxygen wizard; you have to run doxygen from the command line instead.
EDIT August 24, 2020: A new doxygen release, 1.8.20, is available in which this problem has been fixed.
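Running doxygen from the command line amounts to calling the doxygen executable with the configuration file as its argument. A minimal Python sketch of that call, assuming the configuration is saved as Doxyfile in the current directory (the file name and both helper names are assumptions for illustration):

```python
import shutil
import subprocess


def doxygen_command(doxyfile="Doxyfile"):
    """Return the argument list for running doxygen on a config file."""
    return ["doxygen", doxyfile]


def run_doxygen(doxyfile="Doxyfile"):
    """Run doxygen if it is on PATH; return its exit code, or None if it is missing."""
    if shutil.which("doxygen") is None:
        return None
    return subprocess.run(doxygen_command(doxyfile)).returncode


if __name__ == "__main__":
    print(doxygen_command())
    print(run_doxygen())
```

The warnings that the wizard would show in its output pane are written to the terminal instead, unless WARN_LOGFILE is set in the config.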
I need to read data from my query in UTF-8 format. I tried changing the collation of my SQL database. When the data uses the English alphabet everything is fine, but I have trouble with Arabic and other languages.
When I print a string stored in a variable that came from a MySQL query, it shows up like this: ???????
How can I solve this problem so that the strings display correctly?
After retrieving UTF-8 strings from the database, you should manually convert them to CP1256.
You can use the function str:fromutf8() defined below:
local char, byte, pairs, floor = string.char, string.byte, pairs, math.floor
local table_insert, table_concat = table.insert, table.concat
local unpack = table.unpack or unpack
local function unicode_to_utf8(code)
-- converts numeric UTF code (U+code) to UTF-8 string
local t, h = {}, 128
while code >= h do
t[#t+1] = 128 + code%64
code = floor(code/64)
h = h > 32 and 32 or h/2
end
t[#t+1] = 256 - 2*h + code
return char(unpack(t)):reverse()
end
local function utf8_to_unicode(utf8str, pos)
-- pos = starting byte position inside input string (default 1)
pos = pos or 1
local code, size = utf8str:byte(pos), 1
if code >= 0xC0 and code < 0xFE then
local mask = 64
code = code - 128
repeat
local next_byte = utf8str:byte(pos + size) or 0
if next_byte >= 0x80 and next_byte < 0xC0 then
code, size = (code - mask - 2) * 64 + next_byte, size + 1
else
code, size = utf8str:byte(pos), 1
end
mask = mask * 32
until code < mask
end
-- returns code, number of bytes in this utf8 char
return code, size
end
local map_1256_to_unicode = {
[0x80] = 0x20AC,
[0x81] = 0x067E,
[0x82] = 0x201A,
[0x83] = 0x0192,
[0x84] = 0x201E,
[0x85] = 0x2026,
[0x86] = 0x2020,
[0x87] = 0x2021,
[0x88] = 0x02C6,
[0x89] = 0x2030,
[0x8A] = 0x0679,
[0x8B] = 0x2039,
[0x8C] = 0x0152,
[0x8D] = 0x0686,
[0x8E] = 0x0698,
[0x8F] = 0x0688,
[0x90] = 0x06AF,
[0x91] = 0x2018,
[0x92] = 0x2019,
[0x93] = 0x201C,
[0x94] = 0x201D,
[0x95] = 0x2022,
[0x96] = 0x2013,
[0x97] = 0x2014,
[0x98] = 0x06A9,
[0x99] = 0x2122,
[0x9A] = 0x0691,
[0x9B] = 0x203A,
[0x9C] = 0x0153,
[0x9D] = 0x200C,
[0x9E] = 0x200D,
[0x9F] = 0x06BA,
[0xA0] = 0x00A0,
[0xA1] = 0x060C,
[0xA2] = 0x00A2,
[0xA3] = 0x00A3,
[0xA4] = 0x00A4,
[0xA5] = 0x00A5,
[0xA6] = 0x00A6,
[0xA7] = 0x00A7,
[0xA8] = 0x00A8,
[0xA9] = 0x00A9,
[0xAA] = 0x06BE,
[0xAB] = 0x00AB,
[0xAC] = 0x00AC,
[0xAD] = 0x00AD,
[0xAE] = 0x00AE,
[0xAF] = 0x00AF,
[0xB0] = 0x00B0,
[0xB1] = 0x00B1,
[0xB2] = 0x00B2,
[0xB3] = 0x00B3,
[0xB4] = 0x00B4,
[0xB5] = 0x00B5,
[0xB6] = 0x00B6,
[0xB7] = 0x00B7,
[0xB8] = 0x00B8,
[0xB9] = 0x00B9,
[0xBA] = 0x061B,
[0xBB] = 0x00BB,
[0xBC] = 0x00BC,
[0xBD] = 0x00BD,
[0xBE] = 0x00BE,
[0xBF] = 0x061F,
[0xC0] = 0x06C1,
[0xC1] = 0x0621,
[0xC2] = 0x0622,
[0xC3] = 0x0623,
[0xC4] = 0x0624,
[0xC5] = 0x0625,
[0xC6] = 0x0626,
[0xC7] = 0x0627,
[0xC8] = 0x0628,
[0xC9] = 0x0629,
[0xCA] = 0x062A,
[0xCB] = 0x062B,
[0xCC] = 0x062C,
[0xCD] = 0x062D,
[0xCE] = 0x062E,
[0xCF] = 0x062F,
[0xD0] = 0x0630,
[0xD1] = 0x0631,
[0xD2] = 0x0632,
[0xD3] = 0x0633,
[0xD4] = 0x0634,
[0xD5] = 0x0635,
[0xD6] = 0x0636,
[0xD7] = 0x00D7,
[0xD8] = 0x0637,
[0xD9] = 0x0638,
[0xDA] = 0x0639,
[0xDB] = 0x063A,
[0xDC] = 0x0640,
[0xDD] = 0x0641,
[0xDE] = 0x0642,
[0xDF] = 0x0643,
[0xE0] = 0x00E0,
[0xE1] = 0x0644,
[0xE2] = 0x00E2,
[0xE3] = 0x0645,
[0xE4] = 0x0646,
[0xE5] = 0x0647,
[0xE6] = 0x0648,
[0xE7] = 0x00E7,
[0xE8] = 0x00E8,
[0xE9] = 0x00E9,
[0xEA] = 0x00EA,
[0xEB] = 0x00EB,
[0xEC] = 0x0649,
[0xED] = 0x064A,
[0xEE] = 0x00EE,
[0xEF] = 0x00EF,
[0xF0] = 0x064B,
[0xF1] = 0x064C,
[0xF2] = 0x064D,
[0xF3] = 0x064E,
[0xF4] = 0x00F4,
[0xF5] = 0x064F,
[0xF6] = 0x0650,
[0xF7] = 0x00F7,
[0xF8] = 0x0651,
[0xF9] = 0x00F9,
[0xFA] = 0x0652,
[0xFB] = 0x00FB,
[0xFC] = 0x00FC,
[0xFD] = 0x200E,
[0xFE] = 0x200F,
[0xFF] = 0x06D2,
}
local map_unicode_to_1256 = {}
for code1256, code in pairs(map_1256_to_unicode) do
map_unicode_to_1256[code] = code1256
end
function string.fromutf8(utf8str)
local pos, result_1256 = 1, {}
while pos <= #utf8str do
local code, size = utf8_to_unicode(utf8str, pos)
pos = pos + size
code = code < 128 and code or map_unicode_to_1256[code] or ('?'):byte()
table_insert(result_1256, char(code))
end
return table_concat(result_1256)
end
function string.toutf8(str1256)
local result_utf8 = {}
for pos = 1, #str1256 do
local code = str1256:byte(pos)
table_insert(result_utf8, unicode_to_utf8(map_1256_to_unicode[code] or code))
end
return table_concat(result_utf8)
end
Usage is:
str:fromutf8() -- to convert from UTF-8 to cp1256
str:toutf8() -- to convert from cp1256 to UTF-8
Example:
-- This is cp1256 string
local str1256 = "1\128" -- "one euro" in cp1256
-- Convert it to UTF-8
local str_utf8 = str1256:toutf8() -- "1\226\130\172" -- one euro in utf-8
-- Convert it back from UTF-8 to cp1256
local str1256_2 = str_utf8:fromutf8()
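For cross-checking the Lua conversion, the same round trip can be reproduced with Python's built-in cp1256 codec. This is only a comparison sketch: the function names are chosen here for illustration, and errors="replace" is an assumption made to mirror the '?' fallback in string.fromutf8 above.

```python
def utf8_to_cp1256(utf8_bytes):
    """Decode UTF-8 bytes and re-encode them as cp1256 ('?' for unmappable characters)."""
    return utf8_bytes.decode("utf-8").encode("cp1256", errors="replace")


def cp1256_to_utf8(cp1256_bytes):
    """Decode cp1256 bytes and re-encode them as UTF-8."""
    return cp1256_bytes.decode("cp1256").encode("utf-8")


if __name__ == "__main__":
    euro_1256 = b"1\x80"                     # "one euro" in cp1256
    euro_utf8 = cp1256_to_utf8(euro_1256)    # bytes 49, 226, 130, 172
    print(list(euro_utf8))
    print(utf8_to_cp1256(euro_utf8) == euro_1256)
```

Unlike the Lua version, this sketch raises UnicodeDecodeError on input that is not valid UTF-8 rather than passing the bytes through.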
The acc and gyro arrays passed to model.fit have shape (200, 3), and the Input layer shape is also (200, 3). Why is there such a problem? Error when checking input: expected acc_input to have 3 dimensions, but got array with shape (200, 3).
Here's my code:
WIDE = 20
FEATURE_DIM = 30
CHANNEL = 1
CONV_NUM = 64
CONV_LEN = 3
CONV_LEN_INTE = 3#4
CONV_LEN_LAST = 3#5
CONV_NUM2 = 64
CONV_MERGE_LEN = 8
CONV_MERGE_LEN2 = 6
CONV_MERGE_LEN3 = 4
rnn_size=128
acc_input_tensor = Input(shape=(200,3),name = 'acc_input')
gyro_input_tensor = Input(shape=(200,3),name= 'gyro_input')
Acc_input_tensor = Reshape(target_shape=(20,30,1))(acc_input_tensor)
Gyro_input_tensor = Reshape(target_shape=(20,30,1))(gyro_input_tensor)
acc_conv1 = Conv2D(CONV_NUM,(1, 1*3*CONV_LEN),strides= (1,1*3),padding='valid',activation=None)(Acc_input_tensor)
acc_conv1 = BatchNormalization(axis=1)(acc_conv1)
acc_conv1 = Activation('relu')(acc_conv1)
acc_conv1 = Dropout(0.2)(acc_conv1)
acc_conv2 = Conv2D(CONV_NUM,(1,CONV_LEN_INTE),strides= (1,1),padding='valid',activation=None)(acc_conv1)
acc_conv2 = BatchNormalization(axis=1)(acc_conv2)
acc_conv2 = Activation('relu')(acc_conv2)
acc_conv2 = Dropout(0.2)(acc_conv2)
acc_conv3 = Conv2D(CONV_NUM,(1,CONV_LEN_LAST),strides=(1,1),padding='valid',activation=None)(acc_conv2)
acc_conv3 = BatchNormalization(axis=1)(acc_conv3)
acc_conv3 = Activation('relu')(acc_conv3)
acc_conv3 = Dropout(0.2)(acc_conv3)
gyro_conv1 = Conv2D(CONV_NUM,(1, 1*3*CONV_LEN),strides=(1,1*3),padding='valid',activation=None)(Gyro_input_tensor)
gyro_conv1 = BatchNormalization(axis=1)(gyro_conv1)
gyro_conv1 = Activation('relu')(gyro_conv1)
gyro_conv1 = Dropout(0.2)(gyro_conv1)
gyro_conv2 = Conv2D(CONV_NUM,(1, CONV_LEN_INTE),strides=(1,1),padding='valid',activation=None)(gyro_conv1)
gyro_conv2 = BatchNormalization(axis=1)(gyro_conv2)
gyro_conv2 = Activation('relu')(gyro_conv2)
gyro_conv2 = Dropout(0.2)(gyro_conv2)
gyro_conv3 = Conv2D(CONV_NUM,(1, CONV_LEN_LAST),strides=(1,1),padding='valid',activation=None)(gyro_conv2)
gyro_conv3 = BatchNormalization(axis=1)(gyro_conv3)
gyro_conv3 = Activation('relu')(gyro_conv3)
gyro_conv3 = Dropout(0.2)(gyro_conv3)
sensor_conv_in = concatenate([acc_conv3, gyro_conv3], 2)
sensor_conv_in = Dropout(0.2)(sensor_conv_in)
sensor_conv1 = Conv2D(CONV_NUM2,kernel_size=(2, CONV_MERGE_LEN),padding='SAME')(sensor_conv_in)
sensor_conv1 = BatchNormalization(axis=1)(sensor_conv1)
sensor_conv1 = Activation('relu')(sensor_conv1)
sensor_conv1 = Dropout(0.2)(sensor_conv1)
sensor_conv2 = Conv2D(CONV_NUM2,kernel_size=(2, CONV_MERGE_LEN2),padding='SAME')(sensor_conv1)
sensor_conv2 = BatchNormalization(axis=1)(sensor_conv2)
sensor_conv2 = Activation('relu')(sensor_conv2)
sensor_conv2 = Dropout(0.2)(sensor_conv2)
sensor_conv3 = Conv2D(CONV_NUM2,kernel_size=(2, CONV_MERGE_LEN3),padding='SAME')(sensor_conv2)
sensor_conv3 = BatchNormalization(axis=1)(sensor_conv3)
sensor_conv3 = Activation('relu')(sensor_conv3)
conv_shape = sensor_conv3.get_shape()
print(conv_shape)
x1 = Reshape(target_shape=(int(conv_shape[1]), int(conv_shape[2]*conv_shape[3])))(sensor_conv3)
x1 = Dense(64, activation='relu')(x1)
gru_1 = GRU(rnn_size, return_sequences=True, init='he_normal', name='gru1')(x1)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, init='he_normal', name='gru1_b')(x1)
gru1_merged = merge([gru_1, gru_1b], mode='sum')
gru_2 = GRU(rnn_size, return_sequences=True, init='he_normal', name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True, init='he_normal', name='gru2_b')(gru1_merged)
x = merge([gru_2, gru_2b], mode='concat')
x = Dropout(0.25)(x)
n_class=2
x = Dense(n_class)(x)
model = Model(input=[acc_input_tensor,gyro_input_tensor], output=x)
model.compile(loss='mean_squared_error',optimizer='adam')
model.fit(x=[acc,gyro],y=labels,batch_size=1, validation_split=0.2, epochs=2,verbose=1 ,
shuffle=False)
Keras prepends a batch dimension to every input, so Input(shape=(200, 3)) expects arrays of shape (None, 200, 3). None stands for the batch size, which may be unknown when the input arrays are created or reshaped. With batch_size = 128, for example, each input batch has shape (128, 200, 3). A single sample of shape (200, 3) therefore needs a leading batch axis, giving shape (1, 200, 3).
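To make the missing batch axis concrete, here is a small dependency-free sketch using nested lists in place of arrays. The helper names shape and add_batch_axis are made up for illustration; with NumPy arrays the equivalent step would be np.expand_dims(acc, axis=0).

```python
def shape(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


def add_batch_axis(sample):
    """Wrap one sample in a list so it becomes a batch containing one sample."""
    return [sample]


# One sensor sample with 200 time steps and 3 axes, as in the question.
acc_sample = [[0.0, 0.0, 0.0] for _ in range(200)]
acc_batch = add_batch_axis(acc_sample)

print(shape(acc_sample))
print(shape(acc_batch))
```

The second shape has the three dimensions that acc_input expects, while the first has only two, which is what the error message reports.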