I have the sample code below, which loops 50 million times and creates MyDataHolder objects. MyDataHolder has two variants: one is HashMap-based and the other is field-based (see MyDataHolder_map and MyDataHolder_variable).
<code>import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class MemTest {
    public static void main(String[] args) {
        // Swap MyDataHolder_map for MyDataHolder_variable to test the other variant
        List<MyDataHolder_map> list = new LinkedList<>();
        for (int i = 0; i < 50_000_000; i++) {
            if (i % 100_000 == 0) System.out.println(i); // progress output
            MyDataHolder_map d = new MyDataHolder_map();
            d.add("key1", "value1" + i);
            d.add("key2", "value2" + i);
            d.add("key3", "value3" + i);
            d.add("key4", "value4" + i);
            d.add("key5", "value5" + i);
            d.add("key6", "value6" + i);
            list.add(d); // keep every holder reachable
        }
    }
}

// use Map to store data
class MyDataHolder_map {
    private Map<String, String> data = new HashMap<>(6);

    public void add(String key, String value) {
        data.put(key, value);
    }
}

// use variables
class MyDataHolder_variable {
    private String key1;
    private String key2;
    private String key3;
    private String key4;
    private String key5;
    private String key6;

    public void add(String key, String value) {
        // unknown keys are silently ignored
        switch (key) {
            case "key1" -> key1 = value;
            case "key2" -> key2 = value;
            case "key3" -> key3 = value;
            case "key4" -> key4 = value;
            case "key5" -> key5 = value;
            case "key6" -> key6 = value;
        }
    }
}
</code>
I run the code on a machine with 32 GB of RAM using -Xmx32g -verbose:gc and notice that the variable-based MyDataHolder prints the following at the end of the program:
<code>With Variable based MyDataHolder
[32.495s][info][gc] GC(34) Pause Young (Normal) (G1 Evacuation Pause) 24032M->24072M(31312M) 1259.276ms
[35.537s][info][gc] GC(27) Pause Remark 24128M->24128M(32768M) 2.067ms
[59.101s][info][gc] GC(27) Pause Cleanup 24128M->24128M(32768M) 1.132ms
[59.234s][info][gc] GC(27) Concurrent Mark Cycle 39850.818ms
</code>
Notice it took 24128M, i.e. about 24 GB of RAM. However, the HashMap-based MyDataHolder fails, and it prints:
<code>[81.370s][info][gc] GC(73) Pause Young (Normal) (G1 Preventive Collection) 32581M->32583M(32768M) 42.816ms
[81.458s][info][gc] GC(74) To-space exhausted
[81.458s][info][gc] GC(74) Pause Young (Concurrent Start) (G1 Preventive Collection) 32615M->32647M(32768M) 82.420ms
[81.458s][info][gc] GC(75) Concurrent Mark Cycle
[81.519s][info][gc] GC(76) To-space exhausted
[81.519s][info][gc] GC(76) Pause Young (Normal) (G1 Preventive Collection) 32663M->32679M(32768M) 52.343ms
</code>
Notice that it consumed all 32 GB of RAM.
Questions:
- Why does the program take 24 GB with the variable version? It should be far less: there are 6 unique strings per iteration, i.e. 6 * 50M = 300M strings, and each string is 7 bytes, so the total is 300M * 7 bytes = about 2 GB. I understand there is overhead for references and objects, but that can't be 20+ GB. So what is going on?
- Why does the HashMap-based MyDataHolder need more space (32 GB) than the variable-based one? Does HashMap have very high memory overhead?
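As a rough cross-check of the arithmetic in the first question, the sketch below compares the payload-only estimate with one that also counts per-object overhead. The class name HeapEstimate and every size parameter (header size, reference size, the 6 bytes of extra String fields, 14 characters per value) are illustrative assumptions, not measurements; real object layouts depend on the JVM and its flags.

```java
// Back-of-the-envelope heap estimate for the MyDataHolder_variable case.
// Every per-object size used here is an ASSUMPTION for illustration only.
class HeapEstimate {

    // Round up to a multiple of 8 bytes (assumed object alignment).
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // The estimate from the question: string payload only, no overhead.
    static long naiveBytes(long iterations, int stringsPerIteration, int bytesPerString) {
        return iterations * stringsPerIteration * bytesPerString;
    }

    // Assumed bytes retained per loop iteration: the value strings, their
    // backing byte arrays, the holder object, and one LinkedList node.
    static long perIterationBytes(int strings, int charsPerString, int headerBytes, int refBytes) {
        // String: header + reference to its array + assumed 6 bytes of other fields
        long stringObject = align8(headerBytes + refBytes + 6);
        // byte[]: header + 4-byte length + 1 byte per character (assumes Latin-1 content)
        long byteArray = align8(headerBytes + 4 + charsPerString);
        // holder: header + one reference per field
        long holder = align8(headerBytes + (long) strings * refBytes);
        // LinkedList node: header + item, next and prev references
        long node = align8(headerBytes + 3L * refBytes);
        return strings * (stringObject + byteArray) + holder + node;
    }

    public static void main(String[] args) {
        long iterations = 50_000_000L;
        System.out.println("naive estimate (bytes): " + naiveBytes(iterations, 6, 7));
        // Assumed: 12-byte headers and 8-byte (uncompressed) references
        long perIteration = perIterationBytes(6, 14, 12, 8);
        System.out.println("assumed bytes per iteration: " + perIteration);
        System.out.println("assumed total (bytes): " + perIteration * iterations);
    }
}
```

Note that "value1" + i is up to 14 characters long once i has 8 digits, so 7 bytes per string already underestimates the payload before any overhead is counted.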
PS: I am using Java 17.