courtesy of G. Perdue
tracking-based algorithms fail for high-energy events
the "by eye" method is very often more accurate
idea: use image analysis and pattern recognition algorithms
ImageNet is a large database of labeled images used to benchmark visual recognition algorithms
Siberian Husky or Alaskan Malamute?
If you can't explain it simply, you don't understand it well enough.
Albert Einstein
epoch = one loop over the whole training sample
for each feature vector, the weights are updated with a gradient descent step (sketched below)
target: \(y \in \{0, 1\}\)
not well suited to classification
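A minimal sketch of that update loop, assuming a plain linear hypothesis; the data and learning rate below are illustrative, not from the analysis:

```python
import numpy as np

# Hypothetical toy data: 2-D feature vectors with 0/1 targets.
X = np.array([[0.2, 1.1], [1.5, 0.3], [2.0, 2.2], [0.1, 0.4]])
y = np.array([0.0, 1.0, 1.0, 0.0])
X = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias input x_0 = 1

w = np.zeros(X.shape[1])                   # initial weights
alpha = 0.1                                # illustrative learning rate

for epoch in range(100):                   # epoch = one loop over the whole sample
    for x_i, y_i in zip(X, y):             # update weights for each feature vector
        h = w @ x_i                        # linear hypothesis h(x) = w^T x
        w += alpha * (y_i - h) * x_i       # gradient descent step on (y - h)^2 / 2
```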
imagine having some data ~ 100
We can do classification
We can do regression
But real problems are nonlinear
x XOR y = (x AND NOT y) OR (y AND NOT x)
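The identity can be checked in a couple of lines:

```python
# XOR built from AND, OR and NOT, following the identity above.
def xor(x, y):
    return (x and not y) or (y and not x)

for x in (False, True):
    for y in (False, True):
        print(int(x), int(y), int(xor(x, y)))   # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```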
src: deeplearning.net
src: wildml.com
src: arxiv
src: wildml.com
The first goal is to use a CNN to find the vertex in the nuclear target region
Next steps: NC\(\pi^0\)? \(\pi\) momentum? hadron multiplicities?
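For orientation, a minimal Keras sketch of an 11-way classifier (one class per target); the input shape, layer sizes, and optimizer are assumptions for illustration, not the actual analysis network:

```python
import tensorflow as tf

# Sketch of an 11-way CNN classifier: one output class per target region.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(127, 94, 1)),          # hypothetical image size
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(11, activation="softmax"),     # 11 target classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```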
test accuracy: 92.67 %

| target | accuracy |
| 0 | 75.861 % |
| 1 | 94.878 % |
| 2 | 94.733 % |
| 3 | 93.596 % |
| 4 | 90.404 % |
| 5 | 94.011 % |
| 6 | 87.775 % |
| 7 | 85.225 % |
| 8 | 94.109 % |
| 9 | 53.077 % |
| 10 | 96.608 % |
In order to attain the impossible, one must attempt the absurd.
Miguel de Cervantes
Logistic function: \[g(z) = \frac{1}{1 + e^{-z}}\]
Probability of 1: \[P (y = 1 | x, w) = h(x)\]
Probability of 0: \[P (y = 0 | x, w) = 1 - h(x)\]
Probability: \[p (y | x, w) = (h(x))^y\cdot(1 - h(x))^{1 - y}\]
Likelihood: \[L(w) = \prod\limits_{i=0}^n p(y^{(i)} | x^{(i)}, w) = \prod\limits_{i=0}^n (h(x^{(i)}))^{y^{(i)}}\cdot(1 - h(x^{(i)}))^{1 - y^{(i)}}\]
Log-likelihood: \[l(w) = \log L(w) = \sum\limits_{i=0}^n\left[y^{(i)}\log h(x^{(i)}) + (1 - y^{(i)})\log (1-h(x^{(i)}))\right]\]
Learning step (maximize \(l(w)\)): \[w_j = w_j + \alpha\frac{\partial l(w)}{\partial w_j} = w_j + \alpha\sum\limits_{i=0}^n\left(y^{(i)} - h(x^{(i)})\right)x_j^{(i)}\]
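These formulas map almost line by line onto code. A minimal sketch with hypothetical toy data, using batch gradient ascent instead of per-example updates:

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))        # logistic function

def h(x, w):
    return g(x @ w)                        # hypothesis h(x) = g(w^T x)

# Hypothetical toy data; the first column is a bias input x_0 = 1.
X = np.array([[1.0, 0.5, 1.2],
              [1.0, 2.0, 0.3],
              [1.0, 1.5, 2.5],
              [1.0, 0.2, 0.1]])
y = np.array([0.0, 1.0, 1.0, 0.0])

w = np.zeros(X.shape[1])
alpha = 0.1                                # illustrative learning rate

for epoch in range(1000):
    grad = (y - h(X, w)) @ X               # dl/dw_j = sum_i (y_i - h(x_i)) x_j^(i)
    w += alpha * grad                      # step uphill: maximize l(w)

print(h(np.array([1.0, 1.0, 1.0]), w))     # P(y = 1 | x, w) for a new point
```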
Feature vector: \[(x,y) \rightarrow (x,y,x^2,y^2)\]
Hypothesis: \[h (x) = \frac{1}{1 + e^{-w_0 - w_1x - w_2y - w_3x^2 - w_4y^2}}\]
In general, adding extra dimensions by hand would be hard or impossible. Neural networks do that for us.
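A sketch of the hand-made expansion; the logistic regression above then runs on the wider vectors unchanged:

```python
import numpy as np

def expand(X):
    return np.hstack([X, X**2])            # (x, y) -> (x, y, x^2, y^2)

X = np.array([[0.5, -1.0],
              [1.5,  2.0]])                # toy points (x, y)
print(expand(X))                           # each row becomes (x, y, x^2, y^2)
```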
\(x_1\) | 0 | 1 | 0 | 1 |
\(x_2\) | 0 | 0 | 1 | 1 |
AND | 0 | 0 | 0 | 1 |
\[h(x) = \frac{1}{1 + e^{-w^Tx}}\]
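One illustrative choice of hand-picked weights makes this single logistic unit reproduce the AND column above:

```python
import numpy as np

def h(x, w):
    return 1.0 / (1.0 + np.exp(-(w @ x)))  # h(x) = g(w^T x)

w = np.array([-30.0, 20.0, 20.0])          # illustrative weights: bias, x1, x2

for x1 in (0, 1):
    for x2 in (0, 1):
        x = np.array([1.0, x1, x2])        # bias input x_0 = 1
        print(x1, x2, int(h(x, w) > 0.5))  # 1 only when x1 = x2 = 1: AND
```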
Intuition: