adding dropout-by row #8
base: dropout_schedule
Conversation
src/nnet3/nnet-simple-component.cc
Outdated
@@ -154,6 +186,8 @@ void DropoutComponent::Read(std::istream &is, bool binary) {
  ReadBasicType(is, binary, &dim_);
  ExpectToken(is, binary, "<DropoutProportion>");
  ReadBasicType(is, binary, &dropout_proportion_);
  ExpectToken(is, binary, "<DropoutPerFrame>");
Make this back compatible. Change this to ReadToken, then add an if condition to check which token is present.
See other components where ReadToken is used.
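A minimal sketch of the back-compatibility pattern being suggested, using plain C++ stream tokens standing in for Kaldi's ReadToken/ExpectToken; the struct and function names here are assumptions for illustration, not the actual Kaldi code:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the component's fields.
struct DropoutState {
  double dropout_proportion = 0.0;
  bool dropout_per_frame = false;  // new field, absent in old models
};

// Read in a way that accepts both the old format (no <DropoutPerFrame>
// token) and the new one: read the next token first, then branch on
// what it turned out to be.
DropoutState ReadDropout(std::istream &is) {
  DropoutState s;
  std::string tok;
  is >> tok;
  assert(tok == "<DropoutProportion>");
  is >> s.dropout_proportion;
  is >> tok;  // ReadToken instead of ExpectToken
  if (tok == "<DropoutPerFrame>") {  // new format
    is >> std::boolalpha >> s.dropout_per_frame;
    is >> tok;  // consume the closing token
  }  // old format: tok is already the closing token; keep the default
  assert(tok == "</DropoutComponent>");
  return s;
}
```

The key point is that old model files simply never hit the `if` branch, so `dropout_per_frame` keeps its default.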
@GaofengCheng, you need to understand what Vimal was saying here: there needs to be back-compatibility code for the old format. Search for ReadToken() in the file for examples.
However, the reason for your error is that you need to recompile in 'chainbin/' (and possibly 'chain/').
.. and, of course, make this back compatible.
src/nnet3/nnet-utils.h
Outdated
@@ -233,7 +233,7 @@ void FindOrphanNodes(const Nnet &nnet, std::vector<int32> *nodes);
   remove internal nodes directly; instead you should use the command
   'remove-orphans'.

-   set-dropout-proportion [name=<name-pattern>] proportion=<dropout-proportion>
+   set-dropout-proportion [name=<name-pattern>] proportion=<dropout-proportion> perframe=<perframe>
This should probably be made per-frame.
src/nnet3/nnet-utils.cc
Outdated
  if (!config_line.GetValue("proportion", &proportion)) {
    KALDI_ERR << "In edits-config, expected proportion to be set in line: "
              << config_line.WholeLine();
  }
  if (!config_line.GetValue("perframe", &perframe)) {
    perframe = false;
per_frame is a more appropriate name.
src/nnet3/nnet-simple-component.cc
Outdated
@@ -119,16 +131,36 @@ void DropoutComponent::Propagate(const ComponentPrecomputedIndexes *indexes,

  BaseFloat dropout = dropout_proportion_;
  KALDI_ASSERT(dropout >= 0.0 && dropout <= 1.0);
  if(dropout_per_frame_ == true)
if (dropout_per_frame_) {
src/nnet3/nnet-simple-component.cc
Outdated
  out->ApplyHeaviside();  // apply the function (x>0?1:0). Now, a proportion "dropout" will
                          // be zero and (1 - dropout) will be 1.0.
  out->MulElements(in);
}
} else {
is the style used everywhere else.
Change every other place too.
@danpovey I am using CopyColFromVec to randomize the matrix by row, but the speed is really slow. Is there any way to accelerate this? I would appreciate some instruction.
very slow...
src/nnet3/nnet-utils.cc
Outdated
@@ -524,12 +524,14 @@ std::string NnetInfo(const Nnet &nnet) {
}

void SetDropoutProportion(BaseFloat dropout_proportion,
                          bool dropout_per_frame ,
No space before ,
src/nnet3/nnet-utils.cc
Outdated
                          Nnet *nnet) {
  dropout_per_frame = false;
Why is the input to the function ignored?
src/cudamatrix/cu-kernels.cu
Outdated
    mat[index] = mat[index-d.stride-d.cols];
  }
}
@danpovey I think there may be a problem:
LOG (nnet3-chain-train:UpdateParamsWithMaxChange():nnet-chain-training.cc:225) Per-component max-change active on 19 / 35 Updatable Components.(smallest factor=0.223659 on tdnn2.affine with max-change=0.75). Global max-change factor was 0.495221 with max-change=2.
ERROR (nnet3-chain-train:MulElements():cu-matrix.cc:665) cudaError_t 77 : "an illegal memory access was encountered" returned from 'cudaGetLastError()'
[ Stack-Trace: ]
nnet3-chain-train() [0xb78566]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::CuMatrixBase<float>::MulElements(kaldi::CuMatrixBase<float> const&)
kaldi::nnet3::DropoutComponent::Propagate(kaldi::nnet3::ComponentPrecomputedIndexes const*, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float>*) const
kaldi::nnet3::NnetComputer::ExecuteCommand(int)
kaldi::nnet3::NnetComputer::Forward()
kaldi::nnet3::NnetChainTrainer::Train(kaldi::nnet3::NnetChainExample const&)
main
__libc_start_main
nnet3-chain-train() [0x7d23c9]
This is probably because of not using ==.
I think your code will not work in any case. It may require some sync_threads after first setting the values when j == j_tempt
and then copy them for j != j_tempt.
src/cudamatrix/cu-kernels.cu
Outdated
static void _apply_heaviside_by_row(Real* mat, MatrixDim d) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;  // col index
  int j = blockIdx.y * blockDim.y + threadIdx.y;  // row index
  int j_tempt = blockIdx.y * blockDim.y + threadIdx.y;  // row index using to control setting heavyside() in the first rows
Did you want to get the 0th row or something? You have given the same expression as j.
src/cudamatrix/cu-kernels.cu
Outdated
  int j_tempt = blockIdx.y * blockDim.y + threadIdx.y;  // row index using to control setting heavyside() in the first rows
  int index = i + j * d.stride;
  if (i < d.cols && j < d.rows)
    if (j = j_tempt) {
==
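For reference, a tiny self-contained example (separate from the kernel code above) of why `if (j = j_tempt)` is wrong: the single `=` assigns, the condition then tests the assigned value, and `j` is silently clobbered as a side effect:

```cpp
#include <cassert>

// Returns the value of j after evaluating the buggy condition; *taken
// reports whether the branch would have been entered.
int BuggyCompare(int j, int j_tempt, bool *taken) {
  *taken = (j = j_tempt);  // assignment, not comparison: j is overwritten
  return j;
}

int CorrectCompare(int j, int j_tempt, bool *taken) {
  *taken = (j == j_tempt);  // comparison: j is untouched
  return j;
}
```

With `j = 3` and `j_tempt = 0`, the buggy version returns 0 (the overwritten `j`) and never takes the branch, which matches the illegal-memory-access symptom: the guarded block runs for the wrong threads.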
@@ -771,6 +771,11 @@ def __init__(self):
                        lstm*=0,0.2,0'. More general should precede
                        less general patterns, as they are applied
                        sequentially.""")
        self.parser.add_argument("--trainer.dropout-per-frame", type=str,
Is this option required? Do you expect to change whether dropout is per frame or not during the training iterations?
I think dropout-per-frame should only be at the config level.
Also I think you can remove dropout_per_frame from the function SetDropoutProportion
, because that is something you would have already defined from the config. If you really need to change dropout-per-frame during training, I suggest adding a separate function like SetDropoutPerFrame to the DropoutComponent.
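A sketch of the interface being suggested (the names and shape here are assumptions for illustration, not the actual Kaldi class): dropout_per_frame is fixed when the component is initialized from the config, and the training-time setter only touches the proportion:

```cpp
#include <cassert>

class DropoutComponent {
 public:
  // dropout_per_frame is decided once, at config/init time.
  void Init(int dim, double dropout_proportion, bool dropout_per_frame) {
    dim_ = dim;
    dropout_proportion_ = dropout_proportion;
    dropout_per_frame_ = dropout_per_frame;
  }
  // During training, only the proportion is rescheduled.
  void SetDropoutProportion(double p) { dropout_proportion_ = p; }
  double DropoutProportion() const { return dropout_proportion_; }
  bool DropoutPerFrame() const { return dropout_per_frame_; }

 private:
  int dim_ = 0;
  double dropout_proportion_ = 0.0;
  bool dropout_per_frame_ = false;
};
```

This keeps the dropout schedule code unchanged: it only ever calls SetDropoutProportion, and the per-frame behavior stays whatever the config declared.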
Regarding the way to specify the per-frame dropout, I agree with Vimal that
this should be at the config level.
BTW, you could consider (eventually, maybe) a combination of per-frame and
regular dropout, with a config like dropout-per-frame-proportion=x
with 0 <= x <= 1, and then you use both types of dropout.
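One possible reading of that combination (purely illustrative; the parameter names and the multiplicative split are assumptions, not a Kaldi API): a per-row mask and an element-wise mask applied together, with dropout-per-frame-proportion x controlling how much of the total dropout is per-frame:

```cpp
#include <random>
#include <vector>

// Apply per-frame dropout with probability dropout*x and element-wise
// dropout with probability dropout*(1-x), combining the two 0/1 masks
// multiplicatively. x = 1 gives pure per-frame dropout, x = 0 pure
// element-wise dropout.
std::vector<std::vector<float>> CombinedDropout(
    std::vector<std::vector<float>> mat, float dropout, float x,
    std::mt19937 &rng) {
  std::uniform_real_distribution<float> u(0.0f, 1.0f);
  const float per_frame_p = dropout * x;
  const float elementwise_p = dropout * (1.0f - x);
  for (auto &row : mat) {
    // One draw per row decides whether the whole frame survives.
    const float row_keep = (u(rng) >= per_frame_p) ? 1.0f : 0.0f;
    for (auto &v : row) {
      // One draw per element for the ordinary dropout part.
      const float elem_keep = (u(rng) >= elementwise_p) ? 1.0f : 0.0f;
      v *= row_keep * elem_keep;
    }
  }
  return mat;
}
```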
@GaofengCheng, I think you misunderstood what I said.
You say "CopyColFromVec to realize matrix random by row, the speed is
really slow". I said CopyColsFromVec (notice the 's'). It is a different
function, and it will be fast. I think a lot of the code changes you made
might be unnecessary.
@@ -243,6 +244,7 @@ if [ $stage -le 16 ]; then
    --egs.chunk-left-context $chunk_left_context \
    --egs.chunk-right-context $chunk_right_context \
    --trainer.dropout-schedule $dropout_schedule \
    --trainer.dropout-per-frame $dropout_per_frame \
As Vimal says, please remove this from the training code... it does not need to be there.
src/cudamatrix/cu-kernels-ansi.h
Outdated
@@ -64,6 +64,7 @@ void cudaF_apply_pow(dim3 Gr, dim3 Bl, float* mat, float power, MatrixDim d);
void cudaF_apply_pow_abs(dim3 Gr, dim3 Bl, float* mat, float power,
                         bool include_sign, MatrixDim d);
void cudaF_apply_heaviside(dim3 Gr, dim3 Bl, float* mat, MatrixDim d);
void cudaF_apply_heaviside_by_row(dim3 Gr, dim3 Bl, float* mat, MatrixDim d);
there is no need for any of these changes in cudamatrix/... just use CopyColsFromVec.
src/nnet3/nnet-combine.cc
Outdated
@@ -34,7 +34,7 @@ NnetCombiner::NnetCombiner(const NnetCombineConfig &config,
      nnet_params_(std::min(num_nnets, config_.max_effective_inputs),
                   NumParameters(first_nnet)),
      tot_input_weighting_(nnet_params_.NumRows()) {
  SetDropoutProportion(0, &nnet_);
  SetDropoutProportion(0, false, &nnet_);
you can remove this 'false' argument from the function... just make it a fixed property of the component that can't be changed after you initialize.
src/nnet3/nnet-simple-component.cc
Outdated
@@ -87,27 +87,37 @@ void PnormComponent::Write(std::ostream &os, bool binary) const {
}

void DropoutComponent::Init(int32 dim, BaseFloat dropout_proportion) {
void DropoutComponent::Init(int32 dim, BaseFloat dropout_proportion, bool dropout_per_frame) {
please watch line length (80-char limit)
src/nnet3/nnet-simple-component.cc
Outdated
  if (!ok || cfl->HasUnusedValues() || dim <= 0 ||
      dropout_proportion < 0.0 || dropout_proportion > 1.0)
    KALDI_ERR << "Invalid initializer for layer of type "
              << Type() << ": \"" << cfl->WholeLine() << "\"";
  Init(dim, dropout_proportion);
  if( ! ok2 )
  {
you don't need a branch here because dropout_per_frame defaults to false if not set (that's how you
initialized the variable). Don't have the 'ok2' variable; you don't need to check the return status of
cfl->GetValue("dropout-per-frame", &dropout_per_frame);
because it is an optional parameter.
src/nnet3/nnet-simple-component.cc
Outdated
{
  // This const_cast is only safe assuming you don't attempt
  // to use multi-threaded code with the GPU.
  const_cast<CuRand<BaseFloat>&>(random_generator_).RandUniform(out);
Here, you'll want to create a temporary vector with dimension equal to the num-rows in your 'in'/'out' matrices, and do the rand stuff on that, then you'll need CopyColsFromVec().
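In plain C++ (std::vector standing in for CuVector/CuMatrix, so this is only a sketch of the idea, not Kaldi code): draw one uniform value per row, turn it into a 0/1 keep value by thresholding against the dropout proportion, then broadcast that value across the columns of each row, which is the role CopyColsFromVec would play on the GPU:

```cpp
#include <random>
#include <vector>

// Per-frame dropout: each row is kept (scaled by 1) or dropped (zeroed)
// as a whole, instead of masking individual elements.
void PerFrameDropout(std::vector<std::vector<float>> *out, float dropout,
                     std::mt19937 &rng) {
  std::uniform_real_distribution<float> u(0.0f, 1.0f);
  // Temporary vector with dimension equal to the number of rows.
  std::vector<float> row_mask(out->size());
  for (size_t r = 0; r < row_mask.size(); ++r)
    row_mask[r] = (u(rng) >= dropout) ? 1.0f : 0.0f;  // Heaviside step
  // Broadcast the per-row mask across all columns (CopyColsFromVec's
  // job on the GPU), then multiply element-wise (MulElements's job).
  for (size_t r = 0; r < out->size(); ++r)
    for (float &v : (*out)[r]) v *= row_mask[r];
}
```

Because the random draws happen on a vector of length num-rows rather than the full matrix, no custom CUDA kernel is needed.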
@@ -87,11 +87,11 @@ class PnormComponent: public Component {
// "Dropout: A Simple Way to Prevent Neural Networks from Overfitting".
class DropoutComponent : public RandomComponent {
 public:
  void Init(int32 dim, BaseFloat dropout_proportion = 0.0);
  void Init(int32 dim, BaseFloat dropout_proportion = 0.0, bool dropout_per_frame = false);
please watch line length.
line too long
src/nnet3/nnet-simple-component.h
Outdated
  void SetDropoutProportion(BaseFloat dropout_proportion, bool dropout_per_frame) {
    dropout_proportion_ = dropout_proportion;
    dropout_per_frame_ = dropout_per_frame;
  }
Odd indentation. Make sure you are not introducing tabs into the file.
If you use emacs, use
(setq-default tab-width 4)
(setq-default fill-column 80)
(setq-default indent-tabs-mode `nil)
(add-hook 'write-file-hooks 'delete-trailing-whitespace)
(load-file "~/.google-c-style.el")
(add-hook 'c-mode-common-hook 'google-set-c-style)
(add-hook 'c-mode-common-hook 'google-make-newline-indent)
You can find google-c-style.el
from a web search.
@danpovey I am updating the new version; sorry for the incorrect formatting. I'm using Visual Studio Code/Sublime, and so far I haven't found a formatting tool that matches the style we use in Kaldi (I tried formatting nnet-simple-component.cc, and it changed a lot of the existing code, even though it is also Google style). The nnet-simple-component.cc in my PR was formatted by a tool under Sublime.
src/nnet3/nnet-simple-component.h
Outdated
  virtual std::string Info() const;

  void SetDropoutProportion(BaseFloat dropout_proportion) { dropout_proportion_ = dropout_proportion; }
  void SetDropoutProportion(BaseFloat dropout_proportion, bool dropout_per_frame) {
.. remove dropout_per_frame..
#Final valid prob          -0.245208  -0.246
#Final train prob (xent)   -1.47648   -1.54
#Final valid prob (xent)   -2.16365   -2.10
#System                    tdnn_lstm1i_sp_bi_ihmali_ld5  tdnn_lstm1i_dp_sp_bi_ihmali_ld5
When you do a new experiment you should create a different letter/number combination, e.g. 1j, and use the 'compare_wer_general.sh' script (or whatever it's called) to compare with the baseline, if possible. Please stay within the existing conventions for script naming.
... also, if the per-frame dropout turns out, in the end, not to be that useful, we might not want to check it into Kaldi. But let's see how your experiments turn out.
@danpovey it would be better if you could have a look at whether the nnet-simple-component.cc in my PR has the right format.
This reverts commit 463a4dc.
The existing problem is as follows: