\chapter{Processing}

% TODO: cool quote, if I can think of one

\clearpage

From a data science perspective, CMDS has several unique challenges:
\begin{ditemize}
  \item Datasets typically have more than two dimensions, complicating
    \textbf{representation}.
  \item Shape and dimensionality change...
  \item Data can be large (over one million points).  % TODO: contextualize large (not BIG DATA)
\end{ditemize}
I have designed a software package that directly addresses these issues.  %

WrightTools is a software package at the heart of all work in the Wright Group.  %

% TODO: more intro


WrightTools is written in Python, and endeavors to have a ``pythonic'', explicit and ``natural''
application programming interface (API).  %
To use WrightTools, simply import:
\begin{codefragment}{python}
>>> import WrightTools as wt
>>> wt.__version__
'3.0.0'
\end{codefragment}
I'll discuss how WrightTools packaging, distribution, and installation work in
\autoref{sec:processing_disbribution}.

We can use the builtin Python function \mintinline{python}{dir} to interrogate the contents of the
WrightTools package.  %
\begin{codefragment}{python}
>>> dir(wt)
['Collection',
 'Data',
 '__branch__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '__wt5_version__',
 '_dataset',
 '_group',
 '_open',
 '_sys',
 'artists',
 'collection',
 'data',
 'diagrams',
 'exceptions',
 'kit',
 'open',
 'units']
\end{codefragment}
Many of these are dunder (double underscore) attributes---Python internals that are not normally
used directly.  %
The ten attributes that do not start with an underscore form the public API that users of
WrightTools typically work with.  %
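
As a minimal sketch of this distinction, the listing above can be filtered with plain Python.  %
The list below is copied from that output rather than read from an installed package, and
\mintinline{python}{public_names} is a hypothetical helper, not part of WrightTools.  %

```python
# Names copied from the dir(wt) listing shown above.
names = [
    "Collection", "Data", "__branch__", "__builtins__", "__cached__",
    "__doc__", "__file__", "__loader__", "__name__", "__package__",
    "__path__", "__spec__", "__version__", "__wt5_version__",
    "_dataset", "_group", "_open", "_sys",
    "artists", "collection", "data", "diagrams", "exceptions",
    "kit", "open", "units",
]


def public_names(all_names):
    """Return the names that do not start with an underscore (hypothetical helper)."""
    return [name for name in all_names if not name.startswith("_")]


public = public_names(names)
print(len(public))  # 10
print(public)
```

The same filter could be applied to the live output of \mintinline{python}{dir(wt)} when
WrightTools is installed.  %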
The public API includes two classes, \mintinline{python}{Collection} and
\mintinline{python}{Data}, which are the main classes in the WrightTools object model.  %
\mintinline{python}{Data} stores spectra directly as multidimensional arrays, and
\mintinline{python}{Collection} stores \textit{groups} of data objects (and other collection
objects) in a hierarchical way for internal organization purposes.  %

\section{Data object model}  % ====================================================================


WrightTools uses a programming strategy called object-oriented programming (OOP).  %

It contains a central data ``container'' that is capable of storing all of the information about
each multidimensional (or one-dimensional) spectrum: the \mintinline{python}{Data} class.  %
It also defines a \mintinline{python}{Collection} class that contains data objects, collection
objects, and other pieces of metadata in a hierarchical structure.  %
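
The hierarchical idea can be sketched with a small container class.  %
\mintinline{python}{ToyCollection} and its methods are invented for illustration and are not the
WrightTools implementation; plain strings stand in for data objects.  %

```python
class ToyCollection:
    """Hypothetical hierarchical container; not the WrightTools class."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # child name -> data placeholder or nested collection

    def add(self, name, item):
        """Store an item under a name and return it for chaining."""
        self.children[name] = item
        return item

    def walk(self, prefix=""):
        """Yield the slash-separated path of every item below this collection."""
        for name, item in self.children.items():
            path = f"{prefix}/{name}"
            yield path
            if isinstance(item, ToyCollection):
                # Recurse into nested collections to build full paths.
                yield from item.walk(path)


root = ToyCollection("root")
sample = root.add("sample_a", ToyCollection("sample_a"))
sample.add("scan_000", "data placeholder")
sample.add("scan_001", "data placeholder")
root.add("calibration", "data placeholder")

paths = list(root.walk())
print(paths)
```

Each item is reachable by a path through its parent collections, much like files in nested
folders.  %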
Let's first discuss \mintinline{python}{Data}.

All spectra are stored within WrightTools as multidimensional arrays.  %
Arrays are containers that store many instances of the same data type, typically numerical
datatypes.  %
These arrays have some \mintinline{python}{shape} and \mintinline{python}{size}.  %
In the context of WrightTools, they can contain floats, integers, complex numbers and NaNs.  %
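
These array properties can be illustrated with NumPy, used here as an assumed stand-in for the
arrays that WrightTools stores.  %
The array contents are arbitrary; \mintinline{python}{ndim} and \mintinline{python}{dtype} are
standard NumPy attributes shown alongside \mintinline{python}{shape} and
\mintinline{python}{size}.  %

```python
import numpy as np

# A small two-dimensional array of floats, standing in for a spectrum.
arr = np.zeros((3, 4))
arr[0, 0] = np.nan  # NaN marks a missing point; NaN is itself a float value

print(arr.shape)  # (3, 4): length along each dimension
print(arr.size)   # 12: total number of elements
print(arr.ndim)   # 2: number of dimensions
print(arr.dtype)  # float64: the single data type shared by every element
print(int(np.isnan(arr).sum()))  # 1: number of NaN entries

# Complex-valued arrays are created the same way.
z = np.zeros(5, dtype=complex)
print(z.dtype)  # complex128
```

Every element of an array shares one data type, which is why NaN appears here only in the
floating-point array.  %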

The \mintinline{python}{Data} class contains everything that is needed to define a single
spectrum from a single experiment (or simulation).  %
To do this, each data object contains several multidimensional arrays (typically 2 to 50 arrays,
depending on the kind of data).  %
There are two kinds of arrays, instances of \mintinline{python}{Variable} and
\mintinline{python}{Channel}.  %
Variables are coordinate arrays that define the position of each pixel in the multidimensional
spectrum, and channels are each a particular kind of signal within that spectrum.  %
Typical variables might be \mintinline{python}{[w1, w2, w3, d1, d2]}, and typical channels
\mintinline{python}{[pmt, pyro1, pyro2, pyro3]}.  %
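
The relationship between variables and channels can be sketched as follows.  %
\mintinline{python}{ToyData} is a hypothetical class written for this illustration, not the
WrightTools \mintinline{python}{Data} class.  %
The axis values are arbitrary, and storing each variable as a broadcastable NumPy array is an
assumption of the sketch.  %

```python
import numpy as np


class ToyData:
    """Hypothetical stand-in for a data object; not the WrightTools class."""

    def __init__(self):
        self.variables = {}  # coordinate arrays, keyed by name
        self.channels = {}   # signal arrays, keyed by name

    def add_variable(self, name, values):
        self.variables[name] = np.asarray(values, dtype=float)

    def add_channel(self, name, values):
        self.channels[name] = np.asarray(values, dtype=float)

    @property
    def shape(self):
        # The full shape is the broadcast of every array in the object.
        arrays = list(self.variables.values()) + list(self.channels.values())
        return np.broadcast(*arrays).shape

    def pixel(self, index):
        """Return the coordinates and the signals at one pixel index."""
        coords = {k: float(np.broadcast_to(v, self.shape)[index])
                  for k, v in self.variables.items()}
        signals = {k: float(np.broadcast_to(v, self.shape)[index])
                   for k, v in self.channels.items()}
        return coords, signals


data = ToyData()
data.add_variable("w1", np.linspace(1200.0, 1600.0, 5)[:, None])  # shape (5, 1)
data.add_variable("w2", np.linspace(1200.0, 1600.0, 3)[None, :])  # shape (1, 3)
pmt = np.zeros((5, 3))
pmt[2, 1] = 0.5  # arbitrary signal value at one pixel
data.add_channel("pmt", pmt)

print(data.shape)          # (5, 3)
print(data.pixel((2, 1)))  # coordinates and signal at pixel (2, 1)
```

In this sketch the variables give each pixel its position along every axis, while the channel
holds the signal recorded at that position.  %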

\section{Artists}  % ==============================================================================

\subsection{Colormaps}  % -------------------------------------------------------------------------

\subsection{Interpolation}  % ---------------------------------------------------------------------

\section{Fitting}  % ==============================================================================



\section{Distribution and licensing} \label{sec:processing_disbribution}  % =======================


\section{Future directions}  % ====================================================================