Recently I was introduced to PHREEQC, which is a program that models chemical interactions between rocks and water (among many, many other things). It is an open-source project by the USGS, and has a number of powerful features, including modeling equilibrium concentrations of elements according to various input parameters (e.g. temperature, pH). My task for the day (a few weeks ago) was to see if changing partial pressure of CO2 could theoretically have any effect on Alkalinity in surface water. PHREEQC was the solution.
Installing and running PHREEQC is not trivial. There are binary distributions for Mac and Windows, but the method of running the program is somewhat archane: the user is required to provide an input file that looks something like this (this is a modified version of Example 2 in the Version 3 User’s Guide:
TITLE Example 2.--Temperature dependence of solubility of gypsum and anhydrite SOLUTION 1 Pure water pH 7.0 temp 25.0 EQUILIBRIUM_PHASES 1 Gypsum 0.0 1.0 Anhydrite 0.0 1.0 REACTION_TEMPERATURE 1 25.0 75.0 in 51 steps SELECTED_OUTPUT -file ex2.sel -temperature -si anhydrite gypsum END
The above code will calculate the saturation indicies of Gypsum and Anhydrite as they change with temperature (between 25 and 75 degrees), and save the results to a delimited file called
ex2.sel. It takes some sleuthing in the documentation (which is quite good and very complete) to figure that out. Somebody has also made a package for R that will install, run, and parse the results of PHREEQC to an R object.
library(phreeqc) phrLoadDatabaseString(phreeqc.dat) phrRunString(ex2) so <- phrGetSelectedOutput() head(so$n1)
ex2 is a character vector (one element per line) of the input file, which is included in the package as data. Similarly,
phreeqc.dat is a database that contains speciation and geochemical reaction information. There are a number of these databases included, which, being new to PHREEQC, I don’t fully grasp other than that they contain different species, elements, and reactions.
What I would like to type is quite a bit simpler than creating a character vector with a preformed input file. What if creating a solution could be as easy as
solution(pH=7)? I spent an afternoon with this, and ended up with a wrapper package that is called (preliminarily) easyphreeqc.
library(easyphreeqc) # generate input input <- solution(pH = 7, temp = 25) + equilibrium_phases(Gypsum = c(0, 1), Anhydrite = c(0, 1)) + reaction_temperature(low = 25, high = 75, steps = 51) + selected_output(temperature = TRUE, si = c('anhydrite', 'gypsum')) # run phreeqc() with a character vector as input output <- phreeqc(input) # output is the first selected output as a data frame head(output)
The idea is to turn each section of the input file into a function, so that specifying it is more “R-like”. If you have used ggplot2, you’ll notice the syntax is very similar. Of course, you can use regular old
c() to combine the output of
solution() and related functions (they just return, for now, the equivalent character vector that you would read from a file specifying the same solution).
With simplification comes the loss of flexibility, but many of the components of PHREEQC that require that flexibility are features that are more convenient to implement in R (e.g. vectorization of input values such as temperature, or plotting output). Realistically, functions like
solution() should have formal R arguments, rather than just pasting the input together line-by-line. In any case, it’s unlikely I’ll get back to this side-project until after field season, but if you are interested in contributing, the GitHub repo has the source code available. You can install the package using