| -rw-r--r-- | software/chapter.tex | 73 | 
1 files changed, 62 insertions, 11 deletions
 diff --git a/software/chapter.tex b/software/chapter.tex
 index 83fd207..2a6cce2 100644
 --- a/software/chapter.tex
 +++ b/software/chapter.tex
 @@ -1,8 +1,4 @@
  % TODO: add StoddenVictoria2016a (Enhancing reproducibility for computational methods)
 -% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers)
 -% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation)
 -% TODO: reference https://www.nsf.gov/pubs/2016/nsf16532/nsf16532.htm (Software Infrastructure for
 -% Sustained Innovation (SI2: SSE & SSI))
  % TODO: http://pubs.acs.org/doi/10.1021/cen-09535-scitech2
  \chapter{Software}
 @@ -59,7 +55,7 @@ This is in part due to the their general lack of formal training in programming
  development. \textcite{HannayJoErskine2009a} found that over 90\% of scientists learn software
  development through `informal self study', while \textcite{SegalJudith2004a} mentions that
  ``[scientists] do not describe themselves as software developers and have little formal education
 -or training in software development''. HannayJoErskine2009a agrees. 
 +or training in software development''. HannayJoErskine2009a agrees. JoppaLucasN2013a agrees.
  This lack of training is not in-and-of-itself a problem.  %
  After all, academic scientists are required to be ``do-it-yourself''ers in many contexts for which
 @@ -95,6 +91,13 @@ Great software makes science easier, faster, and often of higher quality.  %
  And making great software isn't necessarily harder than the development practices that scientists
  are following today---indeed sometimes it is easier to follow best practices.  %
 +In the United States, funding agencies have recognized the crucial role that software plays in
 +science.  %
 +The National Science Foundation has a long-running ``Software Infrastructure for Sustained
 +Innovation'' (SI$^2$) program, which endeavors to take a ``leadership role in providing software as
 +enabling infrastructure for science and engineering research'' [CITE https://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf].
 +% https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503489
 +
  \section{Challenges in scientific software development}  % ========================================
  Software development ``by-and-for'' scientists poses unique challenges.  %
 @@ -229,7 +232,7 @@ Now I can make some instances of that class, and access their attributes and met
  >>> mary = Person(name='Mary', favorite_food='pizza', hated_food='falafel')
  >>> jane = Person(name='Jane', favorite_food='salad')
  >>> mary.react_to('falafel')
 -'gross---no thank you'
 +'gross---no thank you'''
  >>> jane.react_to('salad')
  'yum! my favorite'
  >>> mary.favorite_food
 @@ -237,16 +240,64 @@ Now I can make some instances of that class, and access their attributes and met
  >>> jane.react_to(mary.favorite_food)
  'meh'
  \end{codefragment}
 -To the knowledge of this author
 +We can already begin to see how powerful this approach is.  %
 +Instances of \python{Person} contain their own attributes and methods.  %
 +Instances can be interacted with in complex or simple ways.  %
 +The attributes \python{favorite_food} and \python{hated_food} are fully accessible, but need not be
 +directly dealt with when using the \python{react_to} method.  %
 +When using OOP, one can hide complexity while still being able to access everything.  %
 +
 +One of the most powerful patterns within OOP is \emph{inheritance}.  %
 +Inheritance is a special relationship between classes.  %
 +When a class (the child) is made to inherit from another class (the parent), all of the attributes
 +and methods of the parent come automatically.  %
 +The child class, then, can benefit from all of the behaviors enabled by its parent while still
 +maintaining its own identity where needed.
 +The inheritance pattern makes it very easy to cleanly define expectations and shared structure
 +throughout a large piece of software without repeating functionality.  %
 +
 +% TODO: more exposition on inheritance, perhaps including an example
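As a minimal sketch of inheritance, consider the following.  %
The \python{Person} class here is a hypothetical stand-in, written only to be consistent with the
outputs shown above, and the \python{PickyEater} child class is invented purely for illustration.  %
The child inherits \python{name} and \python{favorite_food} handling from its parent and overrides
a single method.  %

```python
class Person:
    """Hypothetical stand-in for the Person class used earlier (an assumption)."""

    def __init__(self, name, favorite_food, hated_food=None):
        self.name = name
        self.favorite_food = favorite_food
        self.hated_food = hated_food

    def react_to(self, food):
        if food == self.favorite_food:
            return 'yum! my favorite'
        elif food == self.hated_food:
            return 'gross---no thank you'
        else:
            return 'meh'


class PickyEater(Person):
    """Illustrative child class: inherits from Person, overrides one method."""

    def react_to(self, food):
        # reuse the parent's logic, then change only the indifferent case
        reaction = super().react_to(food)
        if reaction == 'meh':
            return 'gross---no thank you'
        return reaction


sam = PickyEater(name='Sam', favorite_food='pizza')
print(sam.name)                  # attribute set by the inherited __init__
print(sam.react_to('pizza'))     # parent behavior, reached through super()
print(sam.react_to('salad'))     # overridden behavior
print(isinstance(sam, Person))   # a PickyEater is still a Person
```

Note that \python{PickyEater} defines no \python{__init__} of its own: construction is handled
entirely by the parent, so only the behavior that differs needs to be written.  %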
 +
 +OOP is a deep subject with many patterns and concepts behind it.  %
 +There are many places to read further [CITES].
 +I recommend \emph{The Quarks of Object-Oriented Development}, by \textcite{ArmstrongDeborahJ2006a}.  %
  \section{Hierarchical data format}  % -------------------------------------------------------------
 -FITS
 +One of the particularly important challenges in MR-CMDS is data storage.  %
 +MR-CMDS datasets are multi-dimensional, and the particular dimensions are different from experiment
 +to experiment.  %
 +Historically, the Wright Group has stored data as ``flattened'' arrays in plain text, where each
 +column corresponds to one of the scannable pieces of hardware or one of the sensors in the experiment.  %
 +The simplicity and portability of these formats are fantastic, but they do not scale well with
 +increasingly large and higher-dimensional data.  %
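One way to picture the flattened layout is the following sketch of a small two-dimensional scan.  %
The axis names \python{w1_points} and \python{w2_points}, the \python{signal} function, and all
values are made up for illustration and do not correspond to any real acquisition format.  %

```python
import itertools

# Hypothetical 2D scan: two scanned axes and one sensor (all values invented).
w1_points = [1.0, 2.0, 3.0]  # first scanned axis, 3 points
w2_points = [10.0, 20.0]     # second scanned axis, 2 points


def signal(w1, w2):
    """Made-up sensor response, for illustration only."""
    return w1 * w2


# "Flattened" layout: one row per acquisition, one column per axis or sensor.
rows = [(w1, w2, signal(w1, w2))
        for w1, w2 in itertools.product(w1_points, w2_points)]

# Recovering the grid requires knowing the scan shape, which the rows do not record.
shape = (len(w1_points), len(w2_points))
grid = [[rows[i * shape[1] + j][2] for j in range(shape[1])]
        for i in range(shape[0])]

print(len(rows))  # one row for every combination of axis positions
print(grid)       # sensor values arranged back onto the scan grid
```

In this layout every axis position is written out again in each row, and the shape of the scan
has to be supplied separately in order to rebuild the grid.  %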
 +
 +% TODO: justify further why flattening UTF8 files are bad idea
 +
 +Hierarchical data files are an alternative strategy that scales much better with large and
 +high-dimensional data.  %
 +
 +Originally, CDF \cite{TreinshLloydA1987a}.  %
 +Support ``random access to data, so that efficient access of small portions or large data files
 +would be possible''.  %
 -HDF5
 +Then, NetCDF \cite{RewRuss1990a}.
 +More portability.  %
 +Named dimensions.  %
 +Metadata.  %
 +``Hyperslab''
 -CDF (Common Data Format)
 +FITS is used by the astronomy community, with a focus on backwards compatibility
 +\cite{WellsDC1981a}.  %
 +% CITE https://fits.gsfc.nasa.gov/
 +% CONSIDER CITING https://fits.gsfc.nasa.gov/rfc4047.txt
 +
 +
 +I have chosen to build off of HDF5.  %
  \section{Scientific Python}  % --------------------------------------------------------------------
 -Numpy, SciPy
\ No newline at end of file
 +Numpy, SciPy
 +
 +% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers)
 +% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation)
\ No newline at end of file
