I have notebooks that contain R code. On their own, they work fine when run manually. To schedule and automate these notebooks, we have to use pipelines to call them. However, pipelines don't allow inline installation of R packages, so we have to install those packages on the Apache Spark pool. I was able to do this successfully for all of the packages and their dependencies, but I've run into a problem with the HydroVuR.tar.gz package (which also happens to be our one custom package, hmm). Although I was able to add all of the packages to the Apache Spark pool, when I run the notebook I get an error specific to our HydroVuR package only.
You can see below that the HydroVuR.tar.gz package is there on the Spark pool. However, I get a HydroVuR-specific error when I try to load it.
Since these packages are now installed on the pool rather than inline, I comment out the install.packages lines and just run the library() calls:
[1] "Error in library(HydroVuR): there is no package called ‘HydroVuR’"
The STDOUT when I installed it on the pool looks like it was successful?
Is something wrong with my HydroVuR.tar.gz package? Why can’t the pool see it?
Dependency packages and versions (Downloaded from CRAN):
Error:
[1] "Error in library(HydroVuR): there is no package called ‘HydroVuR’"
As you mentioned, you are getting this error when using:
install_github("ROSSSyndicate/HydroVuR")
Install HydroVuR from GitHub using devtools:
devtools::install_github("ROSSSyndicate/HydroVuR")
I have tried the below approach:
install.packages("devtools")
devtools::install_github("jvandens/HydroVuR")
Results:
packageVersion("HydroVuR")
‘0.0.0.9000’
The downloaded source packages are in
‘/tmp/RtmpbueDIx/downloaded_packages’
Downloading GitHub repo jvandens/HydroVuR@HEAD
cpp11 (0.4.7 -> 0.5.0) [CRAN]
curl (5.2.1 -> 5.2.2) [CRAN]
httr2 (1.0.2 -> 1.0.3) [CRAN]
Installing 3 packages: cpp11, curl, httr2
Installing packages into ‘/nfs4/R/user-lib/application_1726032047530_0001’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/cpp11_0.5.0.tar.gz'
Content type 'application/x-gzip' length 275693 bytes (269 KB)
==================================================
downloaded 269 KB
library(HydroVuR)
sessionInfo()
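As a quick way to tell whether a package is actually visible to the current session before calling library(), a check along these lines may help. This is only a minimal sketch: is_installed is a hypothetical helper name, not something provided by Synapse or by HydroVuR, and it relies only on base R's system.file() and .libPaths().

```r
# Hypothetical helper (not part of any package mentioned in this thread):
# returns TRUE if `pkg` is found in any library listed by .libPaths().
is_installed <- function(pkg) {
  # system.file() returns "" when the package cannot be found
  nzchar(system.file(package = pkg))
}

# Library locations that library() will search in this session
print(.libPaths())

# FALSE here corresponds to the "there is no package called" error
print(is_installed("HydroVuR"))
```

If this prints FALSE on the pool while the package shows as uploaded, the package was most likely never installed into any of the listed library paths.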
Reference: jvandens/HydroVuR
I think I found the problem. My .tar.gz package had no version number in its file name. Once I renamed the file to HydroVuR_0.0.0.9000.tar.gz, I was able to install it on the pool and load it, and my pipeline then ran successfully. I think the problem is now solved. Thanks for your help, Dileep.
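For anyone who hits the same thing, a small check like the one below could catch an unversioned file name before uploading. It is a rough sketch, assuming the pool expects the Package_Version.tar.gz naming that R CMD build produces; has_versioned_name is a hypothetical helper, and the pattern is a simplified approximation of R's package name and version rules.

```r
# Hypothetical helper: checks whether a source tarball's file name follows
# the <Package>_<Version>.tar.gz form that `R CMD build` generates.
# The regular expression is a simplified approximation, not an official rule.
has_versioned_name <- function(path) {
  pattern <- "^[A-Za-z][A-Za-z0-9.]*_[0-9]+([.-][0-9]+)+\\.tar\\.gz$"
  grepl(pattern, basename(path))
}

print(has_versioned_name("HydroVuR.tar.gz"))             # FALSE
print(has_versioned_name("HydroVuR_0.0.0.9000.tar.gz"))  # TRUE
```

The version in the file name should match the Version field in the package's DESCRIPTION file, which is 0.0.0.9000 in this case.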