\chapter{Processing}

% TODO: cool quote, if I can think of one

\clearpage

From a data science perspective, CMDS has several unique challenges:
\begin{ditemize}
  \item Datasets often have more than two dimensions, complicating \textbf{representation}.
  \item Shape and dimensionality change...
  \item Data can be large (over one million points).  % TODO: contextualize large (not BIG DATA)
\end{ditemize}
I have designed a software package that directly addresses these issues.  %

WrightTools is a software package at the heart of all work in the Wright Group.  %

% TODO: more intro

WrightTools is written in Python, and endeavors to have a ``pythonic'', explicit and ``natural'' application programming interface (API).  %
To use WrightTools, simply import:
\begin{codefragment}{python}
>>> import WrightTools as wt
>>> wt.__version__
'3.0.0'
\end{codefragment}
I'll discuss how WrightTools packaging, distribution, and installation work in \autoref{sec:processing_disbribution}.

We can use the built-in Python function \mintinline{python}{dir} to interrogate the contents of the WrightTools package.  %
\begin{codefragment}{python}
>>> dir(wt)
['Collection', 'Data', '__branch__', '__builtins__', '__cached__', '__doc__',
 '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__',
 '__version__', '__wt5_version__', '_dataset', '_group', '_open', '_sys',
 'artists', 'collection', 'data', 'diagrams', 'exceptions', 'kit', 'open',
 'units']
\end{codefragment}
Many of these are dunder (double underscore) attributes: Python internals that are not normally used directly.  %
The ten attributes that do not start with an underscore make up the public API that users of WrightTools typically work with.  %
Within the public API are two classes, \mintinline{python}{Collection} and \mintinline{python}{Data}, which are the two main classes in the WrightTools object model.
\mintinline{python}{Data} stores spectra directly as multidimensional arrays, and \mintinline{python}{Collection} stores \textit{groups} of data objects (and other collection objects) hierarchically, for internal organization.  %

\section{Data object model}  % ====================================================================

WrightTools uses a programming strategy called object-oriented programming (OOP).  %
It contains a central data ``container'' that is capable of storing all of the information about each multidimensional (or one-dimensional) spectrum: the \mintinline{python}{Data} class.  %
It also defines a \mintinline{python}{Collection} class that contains data objects, collection objects, and other pieces of metadata in a hierarchical structure.  %
Let's first discuss \mintinline{python}{Data}.

All spectra are stored within WrightTools as multidimensional arrays.  %
Arrays are containers that store many instances of the same data type, typically a numerical data type.  %
These arrays have some \mintinline{python}{shape}, \mintinline{python}{size}, and \mintinline{python}{dtype}.  %
In the context of WrightTools, they can contain floats, integers, complex numbers and NaNs.  %

The \mintinline{python}{Data} class contains everything that is needed to define a single spectrum from a single experiment (or simulation).  %
To do this, each data object contains several multidimensional arrays (typically 2 to 50 arrays, depending on the kind of data).  %
There are two kinds of arrays: instances of \mintinline{python}{Variable} and instances of \mintinline{python}{Channel}.  %
Variables are coordinate arrays that define the position of each pixel in the multidimensional spectrum, and each channel holds one particular kind of signal recorded within that spectrum.  %
Typical variables might be \mintinline{python}{[w1, w2, w3, d1, d2]}, and typical channels \mintinline{python}{[pmt, pyro1, pyro2, pyro3]}.
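
The distinction between variables and channels can be illustrated with a minimal sketch that uses plain NumPy arrays rather than the WrightTools classes themselves.  %
The names \mintinline{python}{w1}, \mintinline{python}{w2} and \mintinline{python}{pmt} are borrowed from the typical lists above; the axis values, the array shapes, the synthetic Gaussian signal and the broadcast-style layout of the coordinate arrays are all assumptions made for illustration, not a description of how WrightTools stores its arrays.

```python
import numpy as np

# Two hypothetical "variables": coordinate arrays for a 2D frequency-frequency
# scan. The axis values are made up for illustration.
w1 = np.linspace(1200.0, 1700.0, 51).reshape(51, 1)  # varies along axis 0
w2 = np.linspace(1200.0, 1700.0, 41).reshape(1, 41)  # varies along axis 1

# One hypothetical "channel": a signal value for every pixel, here a synthetic
# Gaussian peak centered at (1450, 1450) with a width of 50.
pmt = np.exp(-((w1 - 1450.0) ** 2 + (w2 - 1450.0) ** 2) / (2 * 50.0 ** 2))

# A missing pixel can be represented as NaN inside a float array.
pmt[0, 0] = np.nan

# Every array carries a shape, a size and a dtype.
print(pmt.shape, pmt.size, pmt.dtype)  # (51, 41) 2091 float64

# Broadcasting the variables against each other recovers the coordinates of
# every pixel in the channel.
w1_full, w2_full = np.broadcast_arrays(w1, w2)
```

In this sketch each coordinate array only spans the axis along which it changes, so the position of any pixel \mintinline{python}{pmt[i, j]} is \mintinline{python}{(w1[i, 0], w2[0, j])}.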
\subsection{Dimensionality manipulation}  % -------------------------------------------------------

\subsubsection{Chop}

\subsubsection{Collapse}

\subsubsection{Split}

\section{Artists}  % ==============================================================================

After importing and manipulating data, one typically wants to create a plot.  %
The artists sub-package contains everything users need to plot their data objects.  %
This includes both ``quick'' artists, which generate simple plots as quickly as possible, and a full figure layout toolkit that allows users to generate publication-quality figures.  %
It also includes ``specialty'' artists, which are made to perform certain popular plotting operations, as I will describe below.  %

Currently the artists sub-package is built on top of the wonderful matplotlib library.  %
In the future, other libraries (e.g.\ mayavi) may be incorporated.  %

\subsection{Quick}  % -----------------------------------------------------------------------------

\subsubsection{1D}

\subsubsection{2D}

\subsection{Specialty}  % -------------------------------------------------------------------------

\subsection{Artists API}  % -----------------------------------------------------------------------

\subsection{Colormaps}  % -------------------------------------------------------------------------

\subsection{Interpolation}  % ---------------------------------------------------------------------

\section{Fitting}  % ==============================================================================

\section{Distribution and licensing} \label{sec:processing_disbribution}  % =======================

\section{Future directions}  % ====================================================================