My changes to add GPU integration + rebase #3

Open
wants to merge 24 commits into
base: master
0f3741d
Move the exercies from the basic tutorial to a ipython notebook.
abergeron Feb 23, 2015
12d6442
Clean up the headers.
abergeron Feb 24, 2015
2558f9e
Remove useless repeated lstsets.
abergeron Feb 24, 2015
dc0beb7
Don't use \frametitle.
abergeron Feb 24, 2015
6885882
Update the wording on some slides, make it pretty and change the erro…
abergeron Feb 24, 2015
ec759d9
Update the README file.
abergeron Feb 26, 2015
21ac329
Fix formatting
abergeron Feb 26, 2015
6e513ce
opt version of grad code and warn_float64
nouiz Mar 30, 2015
304e035
small modif
nouiz Mar 31, 2015
2237428
Add a slide about broadcasting to the presentation
ejls Apr 10, 2015
c69fe85
Update the COp section to reflect the current API.
abergeron Oct 28, 2015
4e3eee3
Replace the make_thunk section with a section that introduces Op params.
abergeron Oct 28, 2015
4adc851
Fix the exercies so that the solutions actually work
abergeron Oct 29, 2015
180781d
Add missing params.py file for the new params section.
abergeron Oct 29, 2015
1204eda
Add a note about the signature change for perform with params.
abergeron Feb 3, 2016
330372f
Update the overview page to reflect reality.
abergeron Feb 4, 2016
a770924
Fix infer_shape prototype.
abergeron Feb 10, 2016
7ebeae1
Rebuild the pdfs after the rebase.
abergeron Jan 31, 2017
9218283
Ignore the latex temp files.
abergeron Jan 31, 2017
f686f5b
Change grad to L_op.
abergeron Jan 31, 2017
06537a1
Add a section about GPU ops.
abergeron Feb 1, 2017
9956c85
Revise the optimization section to explain the GPU lifting.
abergeron Feb 1, 2017
7b7a725
Update the PDF.
abergeron Feb 1, 2017
f92aa8e
Fixes notices during presentation.
abergeron Feb 1, 2017
10 changes: 10 additions & 0 deletions .gitignore
Expand Up @@ -34,3 +34,13 @@ nosetests.xml
.mr.developer.cfg
.project
.pydevproject

# Latex stuff
*.aux
*.log
*.nav
*.out
*.snm
*.synctex.gz
*.toc
*.vrb
4 changes: 2 additions & 2 deletions 06_scalmulop/01_scalmulop_soln.py
@@ -1,6 +1,6 @@
from theano import Op, Apply
from theano.tensor import as_tensor_variable
from theano.scalar import as_scalar_variable
from theano.scalar import as_scalar

class ScalMulV1(Op):
__props__ = ('scal',)
Expand All @@ -25,7 +25,7 @@ class ScalMulV2(Op):

def make_node(self, x, scal):
x = as_tensor_variable(x)
scal = as_scalar_variable(scal)
scal = as_scalar(scal)
return Apply(self, [x, scal], [x.type()])

def perform(self, node, inputs, output_storage):
Expand Down
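Review note: the two solution classes differ in where the scalar lives. ScalMulV1 lists it in `__props__`, so it is part of the op's identity, while ScalMulV2 receives it as a second graph input. The sketch below illustrates that difference in plain Python. `PropsOp` is a hypothetical stand-in written for this note, not Theano's `Op`, and the `perform` signatures are simplified to lists.

```python
class PropsOp:
    """Toy stand-in for an op base class (hypothetical, NOT Theano's Op)."""
    __props__ = ()

    def _props(self):
        # Collect the attribute values named in __props__, in order.
        return tuple(getattr(self, name) for name in self.__props__)

    def __eq__(self, other):
        # Two ops are equal when they have the same class and the same props.
        return type(self) is type(other) and self._props() == other._props()

    def __hash__(self):
        return hash((type(self), self._props()))


class ScalMulV1(PropsOp):
    # The scalar is part of the op's identity.
    __props__ = ("scal",)

    def __init__(self, scal):
        self.scal = scal

    def perform(self, x):
        return [self.scal * v for v in x]


class ScalMulV2(PropsOp):
    # No props: the scalar arrives as a second input at call time.
    __props__ = ()

    def perform(self, x, scal):
        return [scal * v for v in x]
```

With this layout `ScalMulV1(2) == ScalMulV1(2)` holds but `ScalMulV1(2) == ScalMulV1(3)` does not, whereas every `ScalMulV2()` instance is interchangeable and the scalar can vary between calls.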
1 change: 0 additions & 1 deletion 07_scalmulgrad/01_scalmulop.py
@@ -1,6 +1,5 @@
from theano import Op, Apply
from theano.tensor import as_tensor_variable
from theano.scalar import as_scalar_variable

class ScalMul(Op):
__props__ = ('scal',)
Expand Down
1 change: 0 additions & 1 deletion 07_scalmulgrad/01_scalmulop_soln.py
@@ -1,6 +1,5 @@
from theano import Op, Apply
from theano.tensor import as_tensor_variable
from theano.scalar import as_scalar_variable

class ScalMul(Op):
__props__ = ('scal',)
Expand Down
35 changes: 34 additions & 1 deletion README.md
@@ -1,4 +1,37 @@
ccw_tutorial_theano
===================

Common Code Workflow tutorial on Theano
This repo contains two Theano tutorials.
The first one covers the basics of running and debugging Theano code.
The second one covers extending Theano in Python and C.

Basic tutorial
--------------

This tutorial covers:

* Overview of library (3 min)
* Building expressions (30 min)
* Compiling and running expressions (30 min)
* Modifying expressions (25 min)
* Debugging (30 min)
* Citing Theano (2 min)

In order to follow this tutorial you will need the ipython-notebook
Python package on your computer and a clone of this repo to get the
notebook with exercises.

The following commands should perform the correct installation on most
unix-like machines:

pip install ipython-notebook
git clone https://github.com/abergeron/ccw_tutorial_theano.git
cd ccw_tutorial_theano/ipnb
ipython notebook Theano-basic.ipynb

This should open your browser to the notebook page.

Advanced tutorial
-----------------

COMING SOON
Binary file modified advanced.pdf
Binary file not shown.
128 changes: 62 additions & 66 deletions advanced.tex
@@ -1,6 +1,5 @@
\documentclass[utf8x]{beamer}

% \usepackage{beamerthemesplit} // Activate for custom appearance
\usepackage[utf8x]{inputenc}
\usepackage[OT1]{fontenc}
\usepackage{graphicx}
Expand Down Expand Up @@ -35,8 +34,6 @@
tabsize=4,
backgroundcolor=\color{lightgray},
frame=single,
%showlines=true,
%emph={theano,MyOp,DoubleOp}, emphstyle=\color{lightblue}\bfseries,
emph={[2]__init__,make_node,perform,infer_shape,c_code,make_thunk,grad,R_op},emphstyle={[2]\color{methblue}},
emph={[3]self},emphstyle={[3]\color{darkgreen}},
moredelim=**[is][{\color{red}}]{`}{`}
Expand All @@ -54,7 +51,7 @@ \section*{}
\begin{enumerate}
\item How to Make an Op (Python) (45 min)
\item How to Make an Op (C) (30 min)
\item How to Make a Complex Op (10 min)
\item Op Params (10 min)
\item Optimizations (20 min)
\end{enumerate}
\end{frame}
Expand Down Expand Up @@ -150,28 +147,6 @@ \section{How to Make an Op (Python)}
\end{lstlisting}

You can also use \texttt{theano-nose} which is a wrapper around \texttt{nosetests} with some extra options.

\end{frame}

\begin{frame}{Exercise: TripleOp}
What would need to be changed in the code below (DoubleOp) to make this Op triple the input instead of double?
\lstinputlisting[lastline=15]{doubleop.py}
\end{frame}

\begin{frame}{Solution: TripleOp}
You change the class name and the constant \code{2} for a constant \code{3}. \\
\
\lstinputlisting[lastline=15]{tripleop.py}
\end{frame}

\begin{frame}{Exercise: ScalMulOp}
\begin{center}
Work though the "06\_scalmulop" directory available at \url{https://github.com/abergeron/ccw_tutorial_theano.git}.
\end{center}
\begin{itemize}
\item Take the \code{DoubleOp} code and make it work with an arbitrary scalar
\item There are more than one solution possible, both have advantages and disadvantages
\end{itemize}
\end{frame}

\begin{frame}{\code{infer_shape}}
Expand Down Expand Up @@ -217,15 +192,7 @@ \section{How to Make an Op (Python)}
\begin{frame}{Tests}
To test the gradient we use \code{verify_grad}
\lstinputlisting[linerange={5-5,36-44}]{test_doubleop.py}
It will compute the gradient numerically and symbolically (using our \code{grad()} method) and compare the two.
\end{frame}

\begin{frame}{Exercice: Add Special Methods to ScalMulOp}
Work through the "07\_scalmulgrad" directory available at \url{https://github.com/abergeron/ccw_tutorial_theano.git}
\begin{itemize}
\item Take the ScalMulOp class you made and add the \code{infer_shape} and \code{grad} methods to it.
\item Don't forget to make tests for your new class to make sure everything works correctly.
\end{itemize}
It will compute the gradient numerically and symbolically (using our \code{L_op()} method) and compare the two.
\end{frame}
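Review note: the Tests slide says the gradient is computed both numerically and symbolically and the two are compared. The sketch below shows that general idea in plain Python for the doubling op. It is not `verify_grad`'s implementation, and the helper names (`numeric_grad`, `check_grad`) and tolerances are assumptions made for this note.

```python
def double(x):
    # Forward computation of the doubling op on a list of floats.
    return [2.0 * v for v in x]


def double_grad(x, output_grad):
    # Analytic gradient: d(2 * x_i) / d(x_i) = 2, scaled by the output gradient.
    return [2.0 * g for g in output_grad]


def numeric_grad(fn, x, output_grad, eps=1e-6):
    """Central finite differences of sum(output_grad * fn(x)) w.r.t. each x_i."""
    def cost(point):
        return sum(g * o for g, o in zip(output_grad, fn(point)))

    grads = []
    for i in range(len(x)):
        hi = list(x)
        hi[i] += eps
        lo = list(x)
        lo[i] -= eps
        grads.append((cost(hi) - cost(lo)) / (2 * eps))
    return grads


def check_grad(fn, grad_fn, x, output_grad, tol=1e-4):
    # True when the analytic and numeric gradients agree element-wise.
    analytic = grad_fn(x, output_grad)
    numeric = numeric_grad(fn, x, output_grad)
    return all(abs(a - n) <= tol for a, n in zip(analytic, numeric))
```

A wrong gradient method, for example one returning `3 * output_grad`, would make `check_grad` return `False`, which is the kind of mistake this style of test is meant to catch.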

\section{How to Make an Op (C)}
Expand Down Expand Up @@ -302,15 +269,15 @@ \section{How to Make an Op (C)}

\begin{frame}{Constructor Arguments}
\begin{itemize}
\item Basically you just pass two arguments to the constructor of COp
\item Basically you just pass arguments to the constructor of COp
\begin{itemize}
\item Either by calling the constructor directly \code{COp.__init__(self, ...)}
\item Or via the superclass \code{super(MyOp, self).__init__(...)}
\end{itemize}
\item The two arguments are:
\item The arguments are:
\begin{itemize}
\item the name of the C code file
\item the name of the function to call to make the computation
\item a list of file names with code sections (relative to the location of the op class)
\item the name of a function to call to make the computation (optional)
\end{itemize}
\end{itemize}
\end{frame}
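Review note: the updated slide says the constructor takes a list of file names "with code sections". The files in this PR mark those sections with `#section NAME` lines (see doublecgpu.c). The sketch below shows one way such a file could be split into named sections. It is an illustration written for this note, not Theano's parser, and `split_sections` and `EXAMPLE` are hypothetical names.

```python
def split_sections(source):
    """Group lines under the most recent '#section NAME' marker."""
    sections = {}
    current = None
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#section "):
            # Start a new section named by the rest of the marker line.
            current = stripped[len("#section "):].strip()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    # Join each section's lines and trim surrounding blank lines.
    return {name: "\n".join(body).strip() for name, body in sections.items()}


# Shortened, made-up file content using the two section names from this PR.
EXAMPLE = """\
#section kernels
KERNEL void doublek(void) {}

#section support_code_struct
int double_fn(void) { return 0; }
"""
```

Calling `split_sections(EXAMPLE)` yields one entry per marker, so the kernel source and the support code can be handled separately.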
Expand Down Expand Up @@ -342,32 +309,65 @@ \section{How to Make an Op (C)}
\end{itemize}
\end{frame}

\begin{frame}{Exercice: Add C Code to ScalMulOp}
Work through the "08\_scalmulc" directory available at \url{https://github.com/abergeron/ccw_tutorial_theano.git}.
\section{Op Params}

\begin{frame}[plain]{}
\begin{center}
\Huge Op Params
\end{center}
\end{frame}

\begin{frame}{Purpose}
\begin{itemize}
\item Take the ScalMulOp from before and write C code for it using either approach (only accept vectors).
\item You can base yourself on the C code for DoubleOp.
\item Don't forget to test your new implementation! Be sure to check for invalid inputs (matrices).
\item Used to pass information to the C code
\item Can reduce the amount of compiled C code
\item Required for things that can change from one script run to the other.
\end{itemize}
\end{frame}
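Review note: the claim that params "can reduce the amount of compiled C code" can be made concrete with a toy compile cache. In the sketch below a value pasted into the source text produces a distinct source per value, while a value read from a params argument leaves the source unchanged. The function names and the `params->scal` spelling are assumptions for illustration, not Theano's generated code.

```python
def baked_source(scal):
    # The scalar is pasted into the source text, so each value yields new code.
    return "out[i] = %r * in[i];" % (scal,)


def param_source():
    # The scalar is read at run time from a params argument; the source is fixed.
    return "out[i] = params->scal * in[i];"


def count_compiled_modules(sources):
    """Stand-in for a compiler cache: one 'module' per distinct source text."""
    cache = {}
    for src in sources:
        if src not in cache:
            cache[src] = "module_%d" % len(cache)
    return len(cache)
```

For the scalars 2, 3 and 4, the baked-in variant needs three modules and the params variant needs one.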

\section{How to Make a Complex Op}
\begin{frame}{Usage}
\lstinputlisting{params.py}
\end{frame}

\section{GPU Ops}

\begin{frame}[plain]{}
\begin{center}
\Huge How to Make a Complex Op
\Huge GPU Ops
\end{center}
\end{frame}

\begin{frame}{\code{make_thunk}}
\lstinputlisting[linerange={12-14}]{thunk.py}
\begin{frame}{Overview}
\only<1>{\lstinputlisting[linerange=1-12]{gpu.py}}
\only<2>{\lstinputlisting[linerange=14-20]{gpu.py}
\begin{itemize}
\item \texttt{params\_type} is new.
\item \texttt{get\_params} is new.
\end{itemize}}
\end{frame}

\begin{frame}{Context and Context Name}
\begin{itemize}
\item Define instead of \code{perform} or \code{c_code}
\item Gives total freedom on how the computation is performed
\item More complex to use and generally not needed
\item Context is what is used to refer to the chosen GPU.

It is a C object that can't be serialized.
\item Context Name is a name internal to Theano to refer to a given context object. It is a python string.
\item Context Names are used whenever you need a symbolic object.
\end{itemize}
\end{frame}
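Review note: the split between a context (an object that cannot be serialized) and a context name (a plain string) can be sketched with a small registry. Everything below is hypothetical and written for this note: `FakeContext`, `register_context`, `get_context` and `GpuTypeSketch` are stand-ins, and JSON is used only as a convenient example of serialization.

```python
import json


class FakeContext:
    """Stand-in for a GPU context; json.dumps() on it raises TypeError."""

    def __init__(self, device):
        self.device = device


_contexts = {}  # context name (str) -> context object


def register_context(name, ctx):
    _contexts[name] = ctx


def get_context(name):
    return _contexts[name]


class GpuTypeSketch:
    """Symbolic-side object: stores only the context name."""

    def __init__(self, context_name):
        self.context_name = context_name

    @property
    def context(self):
        # Resolved lazily through the registry, so only the name is stored.
        return get_context(self.context_name)

    def to_json(self):
        return json.dumps({"context_name": self.context_name})

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text)["context_name"])


register_context("dev0", FakeContext("cuda0"))
restored = GpuTypeSketch.from_json(GpuTypeSketch("dev0").to_json())
```

The restored object carries only the string `"dev0"` and looks the context up again on access, which is why symbolic objects can hold names rather than contexts.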

\begin{frame}{Double on GPU}
\only<1>{\lstinputlisting[linerange=5-21]{doublegpu.py}}
\only<2>{\lstinputlisting[linerange=22-37]{doublegpu.py}}
\only<3>{\lstinputlisting[linerange=39-55]{doublegpu.py}}
\end{frame}

\begin{frame}{GpuKernelBase}
\only<1>{\lstinputlisting[linerange=6-20]{doublecgpu.py}}
\only<2>{\lstinputlisting[linerange=1-10]{doublecgpu.c}}
\only<3>{\lstinputlisting[linerange=12-28]{doublecgpu.c}}
\end{frame}

\section{Optimizations}

\begin{frame}[plain]{}
Expand All @@ -384,32 +384,28 @@ \section{Optimizations}
\end{itemize}
\end{frame}

\begin{frame}{Replace an Op (V1)}
\begin{frame}{Replace an Op}
Here is code to use \code{DoubleOp()} instead of \code{ScalMul(2)}.
\lstinputlisting[linerange={1-5,9-15}]{opt.py}
\end{frame}

\begin{frame}{Replace an Op (V2)}
In this case since we are replacing one instance with another there is an easier way.
\lstinputlisting[linerange={1-2,16-20}]{opt.py}
\lstinputlisting[linerange={1-2,7-8,11-20}]{opt.py}
\end{frame}

\begin{frame}{Registering}
In any case you need to register your optimization.
\lstinputlisting[linerange={6-10}]{opt.py}
\lstinputlisting[linerange={21-22}]{opt.py}
\begin{frame}{Replace an Op for GPU}
Here is code to move the Double op to GPU.
\lstinputlisting[linerange={1-5,9-10,22-30}]{opt.py}
\end{frame}
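Review note: the two "Replace an Op" slides rely on a local rewrite that looks at one node and either proposes a replacement or declines. The toy graph below sketches that pattern in plain Python. The `Node` class, the tuple encoding of ops, and the convention of returning `None` to decline are assumptions for this note and do not mirror Theano's optimizer API.

```python
class Node:
    """Toy graph node: an op tag plus a list of inputs (Node or leaf name)."""

    def __init__(self, op, inputs):
        self.op = op          # e.g. ("ScalMul", 2) or ("Double",)
        self.inputs = inputs  # list of Node objects or str leaf names


def local_scalmul2_to_double(node):
    """Return a replacement node, or None when the rewrite does not apply."""
    if node.op == ("ScalMul", 2):
        return Node(("Double",), node.inputs)
    return None


def rewrite(node, local_opt):
    """Apply local_opt bottom-up over the whole graph, building a new graph."""
    if isinstance(node, str):
        return node
    node = Node(node.op, [rewrite(i, local_opt) for i in node.inputs])
    replacement = local_opt(node)
    return replacement if replacement is not None else node


def ops_in(node):
    """List the op tags of a graph, outermost first."""
    if isinstance(node, str):
        return []
    found = [node.op]
    for i in node.inputs:
        found.extend(ops_in(i))
    return found
```

A test for such a rewrite can build a small graph, run `rewrite`, and check with `ops_in` that the old op is gone and the new one is present, which is the same shape of check the Tests slide applies to the real optimization.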

\begin{frame}{Tests}
\lstinputlisting{test_opt.py}
\end{frame}

\begin{frame}{Exercice 4}
Work through the "09\_opt" directory available at \url{https://github.com/abergeron/ccw_tutorial_theano.git}.
\begin{frame}{Exercise}
\begin{itemize}
\item Make an optimization that replace DoubleOp with DoubleC (or DoubleCOp)
\item Write tests to make sure your optimization is applied correctly
\item Implement a ScalMulOp that multiplies its input by an arbitrary scalar value. Start with a Python implementation.
\item Add C code to your implementation.
\item Create a GPU version of your op.
\item Create an optimization that replaces the CPU version with a GPU version when appropriate.
\end{itemize}
Clone the repo at \url{https://github.com/abergeron/ccw_tutorial_theano.git}.
\end{frame}

\end{document}
2 changes: 1 addition & 1 deletion cop.py
Expand Up @@ -4,7 +4,7 @@ class MyOp(COp):
__props__ = ()

def __init__(self, ...):
COp.__init__(self, c_file, func_name)
COp.__init__(self, c_files, func_name)
# Other init code if needed

def make_node(self, ...):
Expand Down
29 changes: 29 additions & 0 deletions doublecgpu.c
@@ -0,0 +1,29 @@
#section kernels
#kernel doublek : *, *, size :

KERNEL void doublek(GLOBAL_MEM DTYPE_o0 *out,
GLOBAL_MEM DTYPE_i0 *a,
ga_size n) {
for (ga_size i = LID_0; i < n; i += LDIM_0) {
out[i] = 2 * a[i];
}
}

#section support_code_struct
int double_fn(PyGpuArrayObject *inp,
PyGpuArrayObject **out,
PyGpuContextObject *ctx) {
size_t n = 1;
Py_XDECREF(*out);
*out = pygpu_empty(PyGpuArray_NDIM(inp),
PyGpuArray_DIMS(inp),
GA_C_ORDER, ctx, Py_None);
if (*out == NULL) return -1;
for (unsigned int i = 0; i < inp->ga.nd; i++)
n *= PyGpuArray_DIM(inp, i);
if (doublek_scall(1, &n, 0, *out, inp, n)) {
PyErr_SetString(PyExc_RuntimeError,
"Error calling kernel");
return -1;
}
return 0;
}
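Review note: the loop in `doublek` starts at `LID_0` and advances by `LDIM_0`, so each thread handles every `LDIM_0`-th element. The plain-Python simulation below shows that this strided pattern covers every index exactly once within one block of threads. It assumes `LID_0` is the local thread index and `LDIM_0` the local size in dimension 0; the function names are hypothetical, and the threads run sequentially here rather than in parallel.

```python
def strided_indices(local_id, local_size, n):
    """Indices visited by one thread: local_id, local_id + local_size, ..."""
    return list(range(local_id, n, local_size))


def run_double_kernel(a, local_size):
    """Simulate one block of local_size threads doubling the elements of a."""
    out = [None] * len(a)
    for local_id in range(local_size):  # on a GPU these run in parallel
        for i in strided_indices(local_id, local_size, len(a)):
            out[i] = 2 * a[i]
    return out
```

The result does not depend on `local_size`: with more threads than elements the extra threads simply do nothing, and with fewer threads each one loops over several elements.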
25 changes: 25 additions & 0 deletions doublecgpu.py
@@ -0,0 +1,25 @@
from theano import Apply
from theano.gpuarray.basic_ops import (as_gpuarray_variable,
infer_context_name, CGpuKernelBase)


class DoubleCGpu(CGpuKernelBase):
__props__ = ()

def __init__(self):
CGpuKernelBase.__init__(self, ["doublecgpu.c"],
"double_fn")

def make_node(self, x):
ctx_name = infer_context_name(x)
x = as_gpuarray_variable(x, ctx_name)
return Apply(self, [x], [x.type()])

def get_params(self, node):
return node.outputs[0].type.context

def infer_shape(self, node, input_shapes):
return input_shapes

def grad(self, inputs, output_grads):
return [output_grads[0] * 2]
2 changes: 1 addition & 1 deletion doublecop.py
Expand Up @@ -6,7 +6,7 @@ class DoubleCOp(COp):
__props__ = ()

def __init__(self):
COp.__init__(self, "./doublecop.c",
COp.__init__(self, ["doublecop.c"],
"APPLY_SPECIFIC(doublecop)")

def make_node(self, x):
Expand Down