Random change in obj_addr() output when including the objects into a list and vectorizing over them

There is a strange behavior of lobstr::obj_addr caused by its vectorization over lists, when the list itself doesn’t change the address

I just started Advanced R by Wickham (2nd ed.) and reached the first exercise of section 2.2.2. I supposed that, given:

a <- 1:10; b <- a; c <- b

all of them would have the same memory address as retrieved by the lobstr::obj_addr function. That is true if we just use a, b or c as inputs, but as I am lazy and wanted to have all the values at once, I did:

list(a, b, c) |> lapply(obj_addr) # lapply or sapply 

Then we obtain a different set of values for the different names every time the function is run. That still happens if we set x <- list(a, b, c) before calling the function through lapply, even though obj_addr(x[[1]]), obj_addr(x[[2]]), obj_addr(x[[3]]) and obj_addr(a) are all equal, so it’s not a matter of creating a new list every time. Does someone know what is going on here? I understand that, up to a point, each call generates a new output object that will have its own memory address, but I don’t know how lapply can interfere with a function like obj_addr that should be constant for a given object.

Thank you in advance!

2

This is caused by a bug in how lobstr identifies the environment

This error arises from the way lobstr uses rlang quosure tools to access the object without increasing the reference count. The purpose of this is to allow garbage collection to happen properly later, by ensuring there are no references to the object hanging around. However, in the case of lapply(x, lobstr::obj_addr), it does not correctly access the environment of the elements of x.

What does lobstr::obj_addr() do?

The (slightly simplified) source is as follows:

obj_addr <- function(x) {
  x <- enquo(x)
  obj_addr_(quo_get_expr(x), quo_get_env(x))
}

obj_addr_() is the C function that gets the memory address. However, first obj_addr() defuses the expression, so it can refer to it without increasing the reference count.
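
To illustrate what "defusing" means here, the following is a rough base-R analogue. It is only a sketch: defuse() is a hypothetical helper, and it uses substitute() and parent.frame() rather than rlang's quosure machinery, so it is not how lobstr itself is implemented.

```r
# Hypothetical helper: capture the argument's expression and the caller's
# frame instead of working with the argument's value directly.
defuse <- function(x) {
  list(expr = substitute(x), env = parent.frame())
}

a <- 1:10
captured <- defuse(a)

# The captured expression is the symbol `a`, not the vector 1:10
is.symbol(captured$expr)

# Evaluating the symbol in the captured environment finds the original object
value <- eval(captured$expr, captured$env)
identical(value, a)
```

The point is that the function holds on to a name and a place to look it up, and only resolves the object at the last moment.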

What happens with defusing in lapply()?

Consider the following function, which returns the environment that the quosure records for its argument:

f <- function(x) {
    rlang::quo_get_env(rlang::enquo(x))
}

We can call this from lapply() in two ways:

x <- list(a = 1, b = 2, c = 3)

lapply(x, \(y) f(y)) # anonymous function
# $a
# <environment: 0x557c265eb438>

# $b
# <environment: 0x557c265ea910>

# $c
# <environment: 0x557c26e275c8>

lapply(x, f) # function provided directly
# $a
# <environment: R_EmptyEnv>

# $b
# <environment: R_EmptyEnv>

# $c
# <environment: R_EmptyEnv>

The first set of results makes sense. lapply() creates a temporary environment each time it calls the anonymous function (the function closure).

However, it does not make sense for the quosure created in lapply(x, f) to point at the empty environment. We know that functions called by lapply() can refer to objects in the global environment, but the empty environment by definition contains no objects and has no parent:

f_parent <- function(x) {
    e <- rlang::quo_get_env(rlang::enquo(x))
    message("Objects in environment: ", length(ls(e)))
    message("Parent environment: ", parent.env(e))
}

lapply(x, f_parent)
# Objects in environment: 0
# Error in parent.env(e) : the empty environment has no parent

So rlang::quo_get_env(rlang::enquo(x)) clearly returns the wrong environment. Let’s try finding the parent environment of the function called by lapply() using base R:

f2 <- function(x) {
    parent.env(environment())
}
lapply(x, f2)
# $a
# <environment: R_GlobalEnv>

# $b
# <environment: R_GlobalEnv>

# $c
# <environment: R_GlobalEnv>

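This makes more sense, although note that parent.env(environment()) is the environment in which f2 was defined (its enclosure), not the frame it was called from. Still, it gives us a clue as to what is going on.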
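
As a side note, a function's enclosure and its calling frame are easy to mix up, so here is a small base-R sketch of the difference. The helper names (where_defined, where_called, caller) are made up for illustration.

```r
# Enclosure: the environment in which the function was defined
where_defined <- function() parent.env(environment())

# Calling frame: the environment from which the function was called
where_called <- function() parent.frame()

caller <- function() {
  here <- environment()
  list(
    enclosure_is_here = identical(where_defined(), here),
    caller_is_here = identical(where_called(), here)
  )
}

res <- caller()
res$enclosure_is_here # FALSE: where_defined() was not defined inside caller()
res$caller_is_here    # TRUE: where_called() was called from caller()
```

So f2 reporting R_GlobalEnv only tells us where f2 was defined; it says nothing about the frame that lapply() calls it from.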

Writing our own function to get the pointer

To rule out lapply() as the source of this inconsistency, let’s write our own version of lobstr::obj_addr() that doesn’t mess around with environments. The relevant line of the C-level obj_addr_() function is where it casts the SEXP to a pointer:

static_cast<void *>(x);

Here is a similar function to get the pointer which skips the rlang stuff:

get_pointer <- inline::cfunction(
    sig = c(x = "integer"),
    body = '
    // cast SEXP to a void pointer like lobstr
    void* ptr = (void*) x;

    // put the pointer in a character array
    char addr_chr[32];
    snprintf(addr_chr, sizeof(addr_chr), "%p", ptr);

    // put address in character vector and return it
    SEXP addr = PROTECT(allocVector(STRSXP, 1));
    SET_STRING_ELT(addr, 0, mkChar(addr_chr));
    UNPROTECT(1);
    return addr;',
    includes = "#include <stdio.h>"
)

Comparing get_pointer() to lobstr::obj_addr()

Let’s define x and check the addresses individually:

x <- list(a = 1, b = 2, c = 3)

lobstr::obj_addr(x[[1]]) # [1] "0x557c22e8eb28"
lobstr::obj_addr(x[[2]]) # [1] "0x557c22e8eb60"
lobstr::obj_addr(x[[3]]) # [1] "0x557c22e8eb98"

We can now compare the results using lobstr in three ways. I’ll use sapply() instead of lapply() as it prints more nicely. We can see that sapply(x, lobstr::obj_addr) is not correct.

lobstr::obj_addrs(x) # correct
# [1] "0x557c22e8eb28" "0x557c22e8eb60" "0x557c22e8eb98"

sapply(x, \(y) lobstr::obj_addr(y)) # correct
#                a                b                c
# "0x557c22e8eb28" "0x557c22e8eb60" "0x557c22e8eb98"

sapply(x, lobstr::obj_addr) # incorrect
#                a                b                c
# "0x557c24fdd7a0" "0x557c24fdd8f0" "0x557c24fdda78"

The question is whether we can get the correct results if we skip the environment stuff. This is where we can use get_pointer():

sapply(x, \(y) get_pointer(y)) # correct
#                a                b                c 
# "0x557c22e8eb28" "0x557c22e8eb60" "0x557c22e8eb98"

sapply(x, get_pointer) # correct
#                a                b                c 
# "0x557c22e8eb28" "0x557c22e8eb60" "0x557c22e8eb98"

So get_pointer() gets the correct results both times, which indicates that the issue lies in lobstr’s use of rlang quosure tools. I am not sure whether this is an rlang issue or a problem with how lobstr uses rlang. As both packages are part of r-lib, I imagine that a bug report filed to either would find its way to the right place pretty quickly. That said, it’s not clear to me how this could be resolved without increasing the reference count of objects when they are inspected.
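
In the meantime, the comparisons above suggest two ways to get consistent addresses for the list from the question. This is a minimal sketch, assuming lobstr is installed and R is 4.1 or later (for the \(y) shorthand); I have not added expected addresses because they differ between sessions.

```r
library(lobstr)

a <- 1:10; b <- a; c <- b
x <- list(a, b, c)

# Option 1: the vectorised helper, which returns the addresses of the
# list's elements
addrs <- obj_addrs(x)

# Option 2: wrap obj_addr() in an anonymous function, so that it is not
# passed to sapply() directly
addrs_wrapped <- sapply(x, \(y) obj_addr(y))

# a, b and c should share a single address, and both options should agree
length(unique(addrs))
identical(unname(addrs), unname(addrs_wrapped))
```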

7

It seems that this behaviour is due to the fact that obj_addr is enquoing its argument before retrieving its address (see the function definition). So, we can examine the behaviour of enquo separately from the actual address-retrieving part.

For checking, we can define a function based on obj_addr that retrieves the address as:

.internal.address = function(.) lobstr:::obj_addr_(rlang::quo_get_expr(.), 
                                                   rlang::quo_get_env(.))

For reference, the object a is located at:

.Internal(inspect(a))
#@58398aa88108 13 INTSXP g1c0 [MARK,REF(65535)]  1 : 10 (expanded)

When called outside the body of a closure, enquo returns nothing useful for retrieving the address of the object: it seems to evaluate the symbol completely (in contrast to substitute, for example) and returns a newly constructed object:

enquo(a)
#<quosure>
#expr: ^<int: 1L, 2L, 3L, 4L, 5L, ...>
#env:  empty
.internal.address(enquo(a))
#[1] "0x5839a29c0018"
.internal.address(enquo(a))
#[1] "0x5839a29c0088"

Inside a closure, though, everything works as expected:

f1 = function(x) enquo(x)
f1(a)
#<quosure>
#expr: ^a
#env:  global
.internal.address(f1(a))
#[1] "0x58398aa88108"
.internal.address(f1(a))
#[1] "0x58398aa88108"

lapply evaluates its arguments while constructing the call, and we can probably simulate that behaviour with force (to make it explicit):

force1 = function(x) { force(x); enquo(x)}
force1(a)
#<quosure>
#expr: ^<int: 1L, 2L, 3L, 4L, 5L, ...>
#env:  empty
.internal.address(force1(a))
#[1] "0x5839a2816848"
.internal.address(force1(a))
#[1] "0x5839a2816b58"

The addresses differ because enquo no longer returns a symbol that can be looked up.

As MrFlick notes in the comments, wrapping with another layer of closure works as expected since enquo does not seem to reach a full evaluation of its argument:

force2 = function(y) { force(y); f1(y)}
force2(a)
#<quosure>
#expr: ^y
#env:  0x58399eef63d8
.internal.address(force2(a))
#[1] "0x58398aa88108"

Additionally, as an example, if we redefine obj_addr along the lines of:

obj_addr2 = function(x) lobstr:::obj_addr_(substitute(x), parent.frame())

then lapply does not cause confusion:

obj_addr(a)
#[1] "0x58398aa88108"
obj_addr2(a)
#[1] "0x58398aa88108"
lapply(list(a, b), obj_addr)
#[[1]]
#[1] "0x58399a991748"
#
#[[2]]
#[1] "0x58399a9917b8"

lapply(list(a, b), obj_addr2)
#[[1]]
#[1] "0x58398aa88108"
#
#[[2]]
#[1] "0x58398aa88108"
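
One way to see why the substitute() version is unaffected: inside lapply(), the expression behind the argument is X[[i]], and parent.frame() is lapply()'s own frame, where X is defined. Here is a base-R sketch of that (peek is a hypothetical helper, not part of lobstr):

```r
# Hypothetical helper: report the argument's expression and whether the
# calling frame holds a variable named X
peek <- function(x) {
  list(
    expr = deparse(substitute(x)),
    caller_has_X = exists("X", envir = parent.frame(), inherits = FALSE)
  )
}

res <- lapply(list(10, 20), peek)
res[[1]]$expr
#[1] "X[[i]]"
res[[1]]$caller_has_X
#[1] TRUE
```

So obj_addr2 ends up evaluating X[[i]] in lapply's frame, which returns the original list element rather than a newly constructed object.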
