author Blaise Thompson <blaise@untzag.com> 2018-04-03 10:20:14 -0500
committer Blaise Thompson <blaise@untzag.com> 2018-04-03 10:20:14 -0500
commit f17a23c402dce796a0e7483d8a822eb6c874d489
tree c81a3fdd00350d2cdd508de41e0f999cc8868414
parent 115d8142cc8777e0caa26867c6eb1b292bd60cf3
2018-04-03 10:20
-rw-r--r-- software/chapter.tex 73
1 files changed, 62 insertions, 11 deletions
diff --git a/software/chapter.tex b/software/chapter.tex
index 83fd207..2a6cce2 100644
--- a/software/chapter.tex
+++ b/software/chapter.tex
@@ -1,8 +1,4 @@
% TODO: add StoddenVictoria2016a (Enhancing reproducibility for computational methods)
-% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers)
-% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation)
-% TODO: reference https://www.nsf.gov/pubs/2016/nsf16532/nsf16532.htm (Software Infrastructure for
-% Sustained Innovation (SI2: SSE & SSI))
% TODO: http://pubs.acs.org/doi/10.1021/cen-09535-scitech2
\chapter{Software}
@@ -59,7 +55,7 @@ This is in part due to their general lack of formal training in programming
development. \textcite{HannayJoErskine2009a} found that over 90\% of scientists learn software
development through `informal self study', while \textcite{SegalJudith2004a} mentions that
``[scientists] do not describe themselves as software developers and have little formal education
-or training in software development''. HannayJoErskine2009a agrees.
+or training in software development''. \textcite{HannayJoErskine2009a} and \textcite{JoppaLucasN2013a} agree.
This lack of training is not in-and-of-itself a problem. %
After all, academic scientists are required to be ``do-it-yourself''ers in many contexts for which
@@ -95,6 +91,13 @@ Great software makes science easier, faster, and often of higher quality. %
And making great software isn't necessarily harder than the development practices that scientists
are following today---indeed sometimes it is easier to follow best practices. %
+In the United States, funding agencies have recognized the crucial role that software plays in
+science. %
+The National Science Foundation has a long-running ``Software Infrastructure for Sustained
+Innovation'' (SI$^2$) program, which endeavors to take a ``leadership role in providing software as
+enabling infrastructure for science and engineering research'' [CITE https://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf].
+% https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503489
+
\section{Challenges in scientific software development} % ========================================
Software development ``by-and-for'' scientists poses unique challenges. %
@@ -229,7 +232,7 @@ Now I can make some instances of that class, and access their attributes and met
>>> mary = Person(name='Mary', favorite_food='pizza', hated_food='falafel')
>>> jane = Person(name='Jane', favorite_food='salad')
>>> mary.react_to('falafel')
'gross---no thank you'
>>> jane.react_to('salad')
'yum! my favorite'
>>> mary.favorite_food
@@ -237,16 +240,64 @@ Now I can make some instances of that class, and access their attributes and met
>>> jane.react_to(mary.favorite_food)
'meh'
\end{codefragment}
-To the knowledge of this author
+We can already begin to see how powerful this approach is. %
+Instances of \python{Person} contain their own attributes and methods. %
+Instances can be interacted with in complex or simple ways. %
+The attributes \python{favorite_food} and \python{hated_food} are fully accessible, but need not be
+directly dealt with when using the \python{react_to} method. %
+When using OOP, one can hide complexity while still being able to access everything. %
+
+One of the most powerful patterns within OOP is \emph{inheritance}. %
+Inheritance is a special relationship between classes. %
+When a class (the child) is made to inherit from another class (the parent), all of the attributes
+and methods of the parent come automatically. %
+The child class, then, can benefit from all of the behaviors enabled by its parent while still
+maintaining its own identity where needed. %
+The inheritance pattern makes it very easy to cleanly define expectations and shared structure
+throughout a large piece of software without repeating functionality. %
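A minimal sketch of this pattern follows, assuming a \python{Person} class shaped like the one used in the session above. The class body here is a reconstruction inferred from that session, not the original definition, and \python{Student}, \python{school}, and \python{introduce} are hypothetical names introduced only for illustration.

```python
class Person:
    """Assumed reconstruction of the Person class used in the session above."""

    def __init__(self, name, favorite_food, hated_food=None):
        self.name = name
        self.favorite_food = favorite_food
        self.hated_food = hated_food

    def react_to(self, food):
        # Compare the offered food against this person's stored preferences.
        if food == self.favorite_food:
            return 'yum! my favorite'
        elif food == self.hated_food:
            return 'gross---no thank you'
        else:
            return 'meh'


class Student(Person):
    """Hypothetical child class: inherits from Person and adds a school."""

    def __init__(self, name, favorite_food, hated_food=None, school=None):
        # Let the parent set up the shared attributes.
        super().__init__(name, favorite_food, hated_food=hated_food)
        self.school = school

    def introduce(self):
        # Behavior that only the child class has.
        return 'I am {0} and I attend {1}'.format(self.name, self.school)


ann = Student(name='Ann', favorite_food='pizza', hated_food='falafel',
              school='Example University')
```

Here \python{ann.react_to('falafel')} works without \python{Student} ever defining \python{react_to}, because the method comes from the parent, while \python{ann.introduce()} is available only on the child.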
+
+% TODO: more exposition on inheritance, perhaps including an example
+
+OOP is a deep subject with many patterns and concepts behind it. %
+There are many places to read further [CITES].
+I recommend \emph{The Quarks of Object-Oriented Development} by \textcite{ArmstrongDeborahJ2006a}. %
\section{Hierarchical data format} % -------------------------------------------------------------
-FITS
+One of the particularly important challenges in MR-CMDS is data storage. %
+MR-CMDS datasets are multi-dimensional, and the particular dimensions are different from experiment
+to experiment. %
+Historically, the Wright Group has stored data as ``flattened'' arrays in plain text, where each
+column corresponds to one of the scannable pieces of hardware or one of the sensors in the experiment. %
+The simplicity and portability of these formats are fantastic, but they do not scale well with
+increasingly large and higher-dimensional data. %
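One possible illustration of the flattened layout and its cost is sketched below. The axis names \python{w1} and \python{w2}, their positions, and the \python{measure} function are all hypothetical stand-ins, not part of any actual acquisition software. The comparison against a gridded layout is only a rough count of stored numbers under these assumptions.

```python
import itertools

# Hypothetical scan axes: names and positions are illustrative only.
axes = {
    'w1': [1.0, 2.0, 3.0],
    'w2': [10.0, 20.0],
}


def measure(w1, w2):
    """Stand-in for a single sensor reading at one point of the scan."""
    return w1 * w2


# Flattened layout: one row per scan point, one column per axis or sensor.
header = list(axes) + ['signal']
rows = [point + (measure(*point),)
        for point in itertools.product(*axes.values())]

# Every row repeats its full set of axis coordinates.
flat_count = len(rows) * len(header)

# A gridded layout stores each axis once, plus one signal value per point.
grid_count = sum(len(positions) for positions in axes.values()) + len(rows)
```

With these toy axes the flattened table already holds more numbers than the gridded layout, and the gap widens with every added dimension because each row repeats all of its coordinates.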
+
+% TODO: justify further why flattened UTF-8 files are a bad idea
+
+Hierarchical data files are an alternative strategy that scales much better with large and
+high-dimensional data. %
+
+Originally, there was CDF \cite{TreinshLloydA1987a}, which was designed to support
+``random access to data, so that efficient access of small portions or large data files
+would be possible''. %
-HDF5
+Then came NetCDF \cite{RewRuss1990a}, which brought more portability, named dimensions, metadata,
+and ``hyperslab'' access. %
-CDF (Common Data Format)
+FITS \cite{WellsDC1981a} is used by the astronomy community, with a focus on backwards
+compatibility. %
+% CITE https://fits.gsfc.nasa.gov/
+% CONSIDER CITING https://fits.gsfc.nasa.gov/rfc4047.txt
+
+
+I have chosen to build off of HDF5. %
\section{Scientific Python} % --------------------------------------------------------------------
-Numpy, SciPy \ No newline at end of file
+NumPy, SciPy
+
+% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers)
+% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation) \ No newline at end of file