Training details
Hi, I want to double check some details about the implementation and the code:
1) I saw the following in dnn.py; is it actually being used like this?
##top 2 layers use a smaller learning rate
##hard-code now, change it later
if layer_size > 4:
    for i in range(layer_size-4, layer_size):
        lr_list[i] = learning_rate * 0.5
2) Is the initialization of the weights completely random?
3) Are we using any drop-out value?
Thanks!
1) Yes. If you use run_dnn.py, it automatically applies 0.5 times the learning rate to the top hidden layers whenever the network has more than 4 layers. This is not the case for run_lstm.py.
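For illustration, here is a minimal, standalone sketch (my own, not Merlin's code; the function name is made up for this example) of how such a per-layer learning-rate list can be built, with the top four layers trained at half the base rate, mirroring the rule quoted in the question:

def build_lr_list(num_layers, learning_rate):
    # one learning rate per layer, all starting at the base rate
    lr_list = [learning_rate] * num_layers
    # for deep networks, halve the rate of the top four layers
    if num_layers > 4:
        for i in range(num_layers - 4, num_layers):
            lr_list[i] = learning_rate * 0.5
    return lr_list

print(build_lr_list(6, 0.002))
# [0.002, 0.002, 0.001, 0.001, 0.001, 0.001]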
2) They are not completely random; the weights are drawn from a Gaussian random generator with zero mean and a variance scaled by the input size of the layer.
https://www.random.org/gaussian-distributions/ (something like this)
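As a rough illustration of that kind of initialisation (a sketch assuming fan-in scaling with variance 1/n_in, not the exact Merlin code), using NumPy:

import numpy as np

def init_weights(n_in, n_out, seed=1234):
    # zero-mean Gaussian whose standard deviation shrinks as the number of
    # inputs grows, keeping activations at a comparable scale across layers
    rng = np.random.RandomState(seed)
    std = np.sqrt(1.0 / n_in)
    return rng.normal(loc=0.0, scale=std, size=(n_in, n_out)).astype(np.float32)

W = init_weights(512, 1024)
print(W.shape, W.std())  # std is roughly sqrt(1/512), i.e. about 0.044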
3) Dropout is implemented in run_lstm.py through the variable "dropout_rate", which is set to 0 by default. You can add this variable to your configuration file under the "Architecture" section and tune it.
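For example, a line such as dropout_rate: 0.5 under the "Architecture" section should switch it on (check the exact key syntax against your existing configuration file). As a rough illustration of what the value controls, here is a sketch of standard inverted dropout, not necessarily Merlin's exact implementation: each hidden unit is zeroed with that probability during training.

import numpy as np

def apply_dropout(activations, dropout_rate, seed=0):
    # with the default dropout_rate of 0 this is a no-op
    if dropout_rate == 0.0:
        return activations
    rng = np.random.RandomState(seed)
    keep_prob = 1.0 - dropout_rate
    # zero each unit with probability dropout_rate, then rescale the
    # survivors so the expected activation stays the same (inverted dropout)
    mask = rng.binomial(n=1, p=keep_prob, size=activations.shape)
    return activations * mask / keep_prob

h = np.ones((4, 8))
print(apply_dropout(h, 0.5))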