I am running some electrochemical corrosion tests and trying to fit a Tafel plot like this
to the Butler-Volmer Equation, where I is the x-axis data and E is the y-axis data.
Because the BV equation and the data start to deviate at higher x values, I am trying to find a way to algorithmically determine the optimum data range to fit. I’m looking for the largest range of data that gives me the smallest residuals. (I’m dividing the sum of squared residuals by the square root of the number of data points to try and bias things towards larger data ranges, but am open to trying other fitness criteria.)
Currently, I’m splitting the data into top and bottom curves, selecting a region from each curve to fit (starting from the minimum x value and moving to the right), then reassembling the curves and running an NLS fit. I’m using a while loop to grow the regions I’m fiting, but it takes a loooooong time. Plus, it grows the range symmetrically, and I might get better fits if the regions were asymmetrical.
Can anyone recommend a faster/better method?
Simplified code:
library(dplyr)
library(minpack.lm) #for nlsLM
my_data #data frame with 2995 rows and 2 columns
#contains my_data$x, my_data$y shown in the plot above
my_min #data frame with 1 row and 2 columns
#my_min$x is the minimum x value; my_min$y is the corresponding y value
#split the curves
top_curve <- #data frame with 782 rows and 2 columns
my_data %>%
filter(y > my_min$y) %>%
arrange(x)
bottom_curve <- #data frame with 2212 rows and 2 columns
my_data %>%
filter(y < my_min$y) %>%
arrange(x)
top_n <- nrow(top_curve)
bottom_n <- nrow(bottom_curve)
#start with fitting 50 points from the top and bottom curves
top_range <- 50
bottom_range <- 50
fit_list <- list() #save fits here
iteration <- 1
while((top_range < top_n) & (bottom_range < bottom_n)){
#make sure the ranges don't ever get bigger than the data
if(top_range > top_n) top_range <- top_n
if(bottom_range > bottom_n) bottom_range <- bottom_n
#get the first n values of the top curve
top_curve_selection <- slice_head(top_curve, n=top_range)
#get the first n values of the bottom curve
bottom_curve_selection <- slice_head(bottom_curve, n=bottom_range)
#recombine curves
curve_selection <- bind_rows(top_curve_selection,
bottom_curve_selection,
my_min)
#fit the BV equation and save to list item
#this is actually wrapped in tryCatch to ignore singular gradient errors
#but I left that out to simplify things
#I used nlsLM because it seems to have fewer issues with singular gradients
fit_list[[iterator]] <- nlsLM(
x~i_corr*(exp(2.303*(y-E_corr)/Ba)-exp(-2.303*(y-E_corr)/Bc)),
data=curve_selection,
start=list(i_corr=1e-6, Ba=0.12, Bc=0.12, E_corr=my_min$y))
#expand ranges and increment iterator
top_range <- top_range + 1
bottom_range <- bottom_range + 1
iterator <- iterator + 1
}