From f17a23c402dce796a0e7483d8a822eb6c874d489 Mon Sep 17 00:00:00 2001 From: Blaise Thompson Date: Tue, 3 Apr 2018 10:20:14 -0500 Subject: 2018-04-03 10:20 --- software/chapter.tex | 73 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 62 insertions(+), 11 deletions(-) (limited to 'software/chapter.tex') diff --git a/software/chapter.tex b/software/chapter.tex index 83fd207..2a6cce2 100644 --- a/software/chapter.tex +++ b/software/chapter.tex @@ -1,8 +1,4 @@ % TODO: add StoddenVictoria2016a (Enhancing reproducibility for computational methods) -% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers) -% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation) -% TODO: reference https://www.nsf.gov/pubs/2016/nsf16532/nsf16532.htm (Software Infrastructure for -% Sustained Innovation (SI2: SSE & SSI)) % TODO: http://pubs.acs.org/doi/10.1021/cen-09535-scitech2 \chapter{Software} @@ -59,7 +55,7 @@ This is in part due to the their general lack of formal training in programming development. \textcite{HannayJoErskine2009a} found that over 90\% of scientists learn software development through `informal self study', while \textcite{SegalJudith2004a} mentions that ``[scientists] do not describe themselves as software developers and have little formal education -or training in software development''. HannayJoErskine2009a agrees. +or training in software development''. HannayJoErskine2009a agrees. JoppaLucasN2013a agrees. This lack of training is not in-and-of-itself a problem. % After all, academic scientists are required to be ``do-it-yourself''ers in many contexts for which @@ -95,6 +91,13 @@ Great software makes science easier, faster, and often of higher quality. % And making great software isn't necessarily harder than the development practices that scientists are following today---indeed sometimes it is easier to follow best practices.
% +In the United States, funding agencies have recognized the crucial role that software plays in +science. % +The National Science Foundation has a long-running ``Software Infrastructure for Sustained +Innovation'' (SI$^2$) program, which endeavors to take a ``leadership role in providing software as +enabling infrastructure for science and engineering research'' [CITE https://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf]. +% https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503489 + \section{Challenges in scientific software development} % ======================================== Software development ``by-and-for'' scientists poses unique challenges. % @@ -229,7 +232,7 @@ Now I can make some instances of that class, and access their attributes and met >>> mary = Person(name='Mary', favorite_food='pizza', hated_food='falafel') >>> jane = Person(name='Jane', favorite_food='salad') >>> mary.react_to('falafel') -'gross---no thank you' +'gross---no thank you''' >>> jane.react_to('salad') 'yum! my favorite' >>> mary.favorite_food @@ -237,16 +240,64 @@ Now I can make some instances of that class, and access their attributes and met >>> jane.react_to(mary.favorite_food) 'meh' \end{codefragment} -To the knowledge of this author +We can already begin to see how powerful this approach is. % +Instances of \python{Person} contain their own attributes and methods. % +Instances can be interacted with in complex or simple ways. % +The attributes \python{favorite_food} and \python{hated_food} are fully accessible, but need not be +directly dealt with when using the \python{react_to} method. % +When using OOP, one can hide complexity while still being able to access everything. % + +One of the most powerful patterns within OOP is \emph{inheritance}. % +Inheritance is a special relationship between classes. % +When a class (the child) is made to inherit from another class (the parent), all of the attributes +and methods of the parent come automatically.
% +The child class, then, can benefit from all of the behaviors enabled by its parent while still +maintaining its own identity where needed. +The inheritance pattern makes it very easy to cleanly define expectations and shared structure +throughout a large piece of software without repeating functionality. % + +% TODO: more exposition on inheritance, perhaps including an example + +OOP is a deep subject with many patterns and concepts behind it. % +There are many places to read further [CITES]. +I recommend The Quarks of Object-Oriented Development, by \textcite{ArmstrongDeborahJ2006a}. % \section{Hierarchical data format} % ------------------------------------------------------------- -FITS +One of the particularly important challenges in MR-CMDS is data storage. % +MR-CMDS datasets are multi-dimensional, and the particular dimensions are different from experiment +to experiment. % +Historically, the Wright Group has stored data as ``flattened'' arrays in plain text, where each +column corresponds to one of the scannable hardware components or one of the sensors in the experiment. % +The simplicity and portability of these formats are fantastic, but they do not scale well with +increasingly large and higher-dimensional data. % + +% TODO: justify further why flattened UTF-8 files are a bad idea + +Hierarchical data files are an alternative strategy that scales much better with large and +high-dimensional data. % + +Originally, CDF \cite{TreinshLloydA1987a}. % +Support ``random access to data, so that efficient access of small portions or large data files +would be possible''. % -HDF5 +Then, NetCDF \cite{RewRuss1990a}. +More portability. % +Named dimensions. % +Metadata. % +``Hyperslab'' -CDF (Common Data Format) +FITS used by astronomy community, with a focus on backwards compatibility. % +\cite{WellsDC1981a} +% CITE https://fits.gsfc.nasa.gov/ +% CONSIDER CITING https://fits.gsfc.nasa.gov/rfc4047.txt + + +I have chosen to build off of HDF5.
% \section{Scientific Python} % -------------------------------------------------------------------- -Numpy, SciPy \ No newline at end of file +NumPy, SciPy + +% TODO: add MillmanKJarrod2011a (Python for Scientists and Engineers) +% TODO: add vanderWaltStefan2011a (The NumPy Array: A Structure for Efficient Numerical Computation) \ No newline at end of file -- cgit v1.2.3