I am trying to use an MLP for classification (yes, I’m aware of the dnn module, but right now I’m interested in ml), so I put together a simple example to test things with a simple formula: an array of classId / classesCount values => a zero-filled vector with a 1 at the classId index:
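A minimal sketch of that encoding in plain NumPy (the function name, the input length of 4, and the class count are my assumptions, not the original code):

```python
import numpy as np

CLASSES_COUNT = 10

def make_sample(class_id):
    # Input: a flat array filled with the normalized class id.
    x = np.full(4, class_id / CLASSES_COUNT, dtype=np.float32)
    # Target: a zero vector with a 1 at the class_id index (one-hot).
    y = np.zeros(CLASSES_COUNT, dtype=np.float32)
    y[class_id] = 1.0
    return x, y

x, y = make_sample(3)
print(x)  # four values of 0.3
print(y)  # one-hot vector with the 1 at index 3
```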
In this specific example, yes: just flat input data for now.
Not that I’m aware of, at least. In the case of CNNs it would depend on the actual network configuration, but ml provides no such tools, only an activation function and learning conditions.
Randomized as in not all the inputs being the same? I tried it with actual 20x20 monochrome images (hence the 400 input neurons) but was getting equally strange results. So I decided to try it on this simpler data instead to confirm it’s working as it should.
Thanks. I tried it now, but got even more confused: all scales and weights are now -0., which results in all-zero outputs.
that would mean you’d want regression, not classification
(like: ‘learn’ the input function, and produce similar output)
(and sadly, this isn’t possible with ANN_MLP, since it would need a linear activation for the last layer, and you can’t set activation functions per layer independently)
So regression as in “take 10 smartphone parameters - give a single value rating how good it is”? Or no?
UPD: Does that mean a simple neural network with roughly the same structure, but different activation functions for different layers, might handle this task?
UPD2: Wait, I’ve just remembered I made an even simpler MLP (also for image classification, if I understand this term correctly; it had to recognize handwriting) back in university, and it had 2 layers with linear activation. Might it be that the hidden layer is messing up the result in this specific case?
UPD3: Apparently not. With just 2 layers it’s back to the same result regardless of activation type and
well, IDK how ANN_MLP training is actually implemented.
in a dense layer, if the weights of one “neuron” are equal to the weights of another, they behave identically, and any updates from training will affect them identically, so they’ll stay twins forever… and if the whole layer’s weights are initialized the same, the whole layer is nearly worthless, because it’s equivalent to a single neuron.
there has to be some randomness that affects neurons individually. it’s either in the initialization or somewhere in the training.
stochastically picking training data for a batch may be random, but it wouldn’t cause neurons to differentiate.
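the symmetry argument above can be sketched with plain NumPy (a toy two-neuron hidden layer, not ANN_MLP’s actual implementation; all sizes and the learning rate are made up for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 samples, 3 features
y = rng.normal(size=(5, 1))   # arbitrary targets, just for the demo

# Both hidden neurons start with identical weights.
W1 = np.ones((3, 2))
W2 = np.ones((2, 1))
lr = 0.01

for _ in range(10):
    h = np.tanh(x @ W1)            # hidden activations: columns identical
    out = h @ W2
    err = out - y
    # Backprop: because the hidden columns are identical, the gradients
    # for the two neurons are identical too, on every step.
    dW2 = h.T @ err
    dh = (err @ W2.T) * (1 - h**2)
    dW1 = x.T @ dh
    W1 -= lr * dW1
    W2 -= lr * dW2

# The two neurons (columns of W1) never diverge -- twins forever.
print(np.allclose(W1[:, 0], W1[:, 1]))  # True
```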
maybe the docs lie and the weights aren’t initialized to all 0
maybe they inject some randomness during training, in the right places
since you showed one output that looks kinda one-hot, maybe the network did train decently, but there’s some issue with indexing? you said the inputs didn’t match the outputs you expected… but did the outputs always look somewhat one-hot, or did you see any other patterns?
It turned out not to be very helpful, unfortunately. It uses a dataset not of actual images, but one with 17 parameters per image, such as the total number of pixels in an object (a letter) or various means and correlations you have to compute from an image yourself.
I’ve tried predicting with a single [400x1] vector as well (the last one, to be precise), to also get a 1-dimensional ([10x1]) output. And I’m getting the same data.
The other variation is the same pattern (0 and 2…9 are the same, 1 is different), but with those 2 values switching places.
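in case it helps: if the raw [10x1] outputs really do follow that “one index stands out” pattern, decoding them with argmax sidesteps the absolute scale of the activations entirely (the numbers below are invented to mimic the pattern described, not your actual outputs):

```python
import numpy as np

# Hypothetical raw outputs: indices 0 and 2..9 share one level,
# index 1 stands out -- the pattern described above.
raw = np.array([0.12, 0.91, 0.12, 0.12, 0.12, 0.12, 0.12, 0.12, 0.12, 0.12])

# The predicted class is simply the index of the largest activation.
predicted_class = int(np.argmax(raw))
print(predicted_class)  # 1
```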