So like said in the title, I’m trying to partition a graph using METIS. More precisely, I have a mesh that has 297 elements, so those are the 297 nodes of the graph and I want to partition it into 64 parts for distributing the workload on 64 different MPI processes. So the nodes/elements are weighted by their computational cost, and for simplicity one can say it is the number of nodes within each element.
Here is an adaptation of my block of code from inside my function that is a reproducible example :
#include<vector>
#include "metis.h"
int main(int* argc, char** argv) {
size_t nME = 297;
std::vector<idx_t> xadj = {0, 3, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 49, 52, 56, 60, 65, 70, 75, 80, 85, 90, 94, 98, 102, 106, 111, 116, 121, 126, 131, 136, 140, 144, 148, 152, 157, 162, 167, 172, 177, 182, 187, 192, 197, 202, 206, 210, 214, 218, 223, 228, 233, 238, 243, 248, 252, 256, 260, 264, 269, 274, 279, 284, 289, 294, 298, 302, 306, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 364, 368, 372, 377, 382, 388, 393, 399, 404, 410, 415, 421, 426, 432, 436, 441, 445, 449, 454, 459, 464, 469, 474, 479, 484, 489, 494, 499, 503, 507, 511, 516, 521, 527, 532, 538, 543, 549, 554, 560, 565, 571, 575, 580, 583, 586, 590, 594, 598, 602, 606, 610, 614, 618, 622, 626, 629, 632, 635, 639, 643, 647, 651, 655, 658, 661, 665, 669, 673, 677, 681, 684, 687, 691, 695, 699, 703, 707, 710, 713, 717, 721, 725, 729, 733, 736, 739, 743, 747, 751, 755, 759, 762, 765, 769, 773, 777, 781, 785, 788, 791, 795, 799, 803, 807, 811, 814, 817, 821, 825, 829, 833, 837, 840, 843, 847, 851, 855, 859, 863, 866, 871, 876, 881, 886, 891, 896, 901, 906, 911, 916, 921, 926, 931, 936, 941, 946, 951, 956, 961, 966, 971, 976, 981, 986, 991, 996, 1001, 1006, 1011, 1016, 1021, 1026, 1031, 1036, 1041, 1046, 1051, 1056, 1061, 1066, 1071, 1076, 1081, 1086, 1091, 1096, 1101, 1106, 1111, 1116, 1121, 1126, 1131, 1136, 1141, 1146, 1150, 1154, 1158, 1162, 1166, 1170, 1174, 1178, 1182, 1186, 1190, 1194, 1199, 1204, 1209, 1214, 1219, 1224, 1229, 1234, 1238, 1242, 1246, 1250, 1254, 1258, 1262, 1266, 1270, 1274, 1278, 1282, 1287, 1292, 1297, 1302, 1307, 1312, 1317, 1322};
std::vector<idx_t> adjncy = {1, 2, 14, 0, 3, 15, 0, 3, 4, 289, 1, 2, 5, 290, 2, 5, 6, 273, 3, 4, 7, 274, 4, 7, 8, 16, 5, 6, 9, 17, 6, 9, 10, 18, 7, 8, 11, 19, 8, 11, 12, 20, 9, 10, 13, 21, 10, 13, 22, 11, 12, 23, 0, 15, 291, 24, 1, 14, 292, 25, 6, 17, 275, 18, 26, 7, 16, 276, 19, 27, 8, 16, 19, 20, 28, 9, 17, 18, 21, 29, 10, 18, 21, 22, 30, 11, 19, 20, 23, 31, 12, 20, 23, 32, 13, 21, 22, 33, 14, 25, 295, 34, 15, 24, 296, 35, 16, 27, 271, 28, 40, 17, 26, 272, 29, 41, 18, 26, 29, 30, 42, 19, 27, 28, 31, 43, 20, 28, 31, 32, 44, 21, 29, 30, 33, 45, 22, 30, 33, 46, 23, 31, 32, 47, 24, 35, 36, 48, 25, 34, 37, 49, 34, 37, 38, 293, 50, 35, 36, 39, 294, 51, 36, 39, 40, 269, 52, 37, 38, 41, 270, 53, 26, 38, 41, 42, 54, 27, 39, 40, 43, 55, 28, 40, 43, 44, 249, 29, 41, 42, 45, 250, 30, 42, 45, 46, 225, 31, 43, 44, 47, 226, 32, 44, 47, 56, 33, 45, 46, 57, 34, 49, 50, 58, 35, 48, 51, 59, 36, 48, 51, 52, 60, 37, 49, 50, 53, 61, 38, 50, 53, 54, 62, 39, 51, 52, 55, 63, 40, 52, 55, 251, 64, 41, 53, 54, 252, 65, 46, 57, 227, 66, 47, 56, 228, 67, 48, 59, 60, 68, 49, 58, 61, 69, 50, 58, 61, 62, 70, 51, 59, 60, 63, 71, 52, 60, 63, 64, 72, 53, 61, 62, 65, 73, 54, 62, 65, 255, 74, 55, 63, 64, 256, 75, 56, 67, 223, 80, 57, 66, 224, 81, 58, 69, 70, 82, 59, 68, 71, 83, 60, 68, 71, 72, 84, 61, 69, 70, 73, 85, 62, 70, 73, 74, 86, 63, 71, 72, 75, 87, 64, 72, 75, 76, 88, 65, 73, 74, 77, 89, 74, 77, 78, 253, 90, 75, 76, 79, 254, 91, 76, 79, 80, 221, 92, 77, 78, 81, 222, 93, 66, 78, 81, 94, 67, 79, 80, 95, 68, 83, 84, 96, 69, 82, 138, 85, 97, 70, 82, 85, 86, 98, 71, 83, 84, 139, 87, 99, 72, 84, 87, 88, 100, 73, 85, 86, 140, 89, 101, 74, 86, 89, 90, 102, 75, 87, 88, 141, 91, 103, 76, 88, 91, 92, 104, 77, 89, 90, 142, 93, 105, 78, 90, 93, 94, 106, 79, 91, 92, 143, 95, 107, 80, 92, 95, 108, 81, 93, 94, 144, 109, 82, 97, 98, 110, 83, 96, 99, 111, 84, 96, 99, 100, 112, 85, 97, 98, 101, 113, 86, 98, 101, 102, 114, 87, 99, 100, 103, 115, 88, 100, 103, 104, 116, 89, 101, 102, 105, 117, 90, 102, 105, 106, 118, 91, 103, 104, 107, 119, 92, 104, 107, 108, 120, 93, 105, 106, 109, 121, 94, 106, 109, 122, 95, 107, 108, 123, 96, 111, 112, 124, 97, 110, 194, 113, 125, 98, 110, 113, 114, 126, 99, 111, 112, 195, 115, 127, 100, 112, 115, 116, 128, 101, 113, 114, 196, 117, 129, 102, 114, 117, 118, 130, 103, 115, 116, 197, 119, 131, 104, 116, 119, 120, 132, 105, 117, 118, 198, 121, 133, 106, 118, 121, 122, 134, 107, 119, 120, 199, 123, 135, 108, 120, 123, 136, 109, 121, 122, 200, 137, 110, 125, 126, 111, 124, 127, 112, 124, 127, 128, 113, 125, 126, 129, 114, 126, 129, 130, 115, 127, 128, 131, 116, 128, 131, 132, 117, 129, 130, 133, 118, 130, 133, 134, 119, 131, 132, 135, 120, 132, 135, 136, 121, 133, 134, 137, 122, 134, 137, 123, 135, 136, 83, 139, 145, 85, 138, 140, 146, 87, 139, 141, 147, 89, 140, 142, 148, 91, 141, 143, 149, 93, 142, 144, 150, 95, 143, 151, 138, 146, 152, 139, 145, 147, 153, 140, 146, 148, 154, 141, 147, 149, 155, 142, 148, 150, 156, 143, 149, 151, 157, 144, 150, 158, 145, 153, 159, 146, 152, 154, 160, 147, 153, 155, 161, 148, 154, 156, 162, 149, 155, 157, 163, 150, 156, 158, 164, 151, 157, 165, 152, 160, 166, 153, 159, 161, 167, 154, 160, 162, 168, 155, 161, 163, 169, 156, 162, 164, 170, 157, 163, 165, 171, 158, 164, 172, 159, 167, 173, 160, 166, 168, 174, 161, 167, 169, 175, 162, 168, 170, 176, 163, 169, 171, 177, 164, 170, 172, 178, 165, 171, 179, 166, 174, 180, 167, 173, 175, 181, 168, 174, 176, 182, 169, 175, 177, 183, 170, 176, 178, 184, 171, 177, 179, 185, 172, 178, 186, 173, 181, 187, 174, 180, 182, 188, 175, 181, 183, 189, 176, 182, 184, 190, 177, 183, 185, 191, 178, 184, 186, 192, 179, 185, 193, 180, 188, 194, 181, 187, 189, 195, 182, 188, 190, 196, 183, 189, 191, 197, 184, 190, 192, 198, 185, 191, 193, 199, 186, 192, 200, 111, 187, 195, 113, 188, 194, 196, 115, 189, 195, 197, 117, 190, 196, 198, 119, 191, 197, 199, 121, 192, 198, 200, 123, 193, 199, 202, 203, 211, 231, 207, 201, 204, 212, 232, 208, 201, 204, 229, 205, 237, 202, 203, 230, 206, 238, 203, 206, 237, 207, 213, 204, 205, 238, 208, 214, 201, 205, 208, 211, 215, 202, 206, 207, 212, 216, 210, 211, 233, 231, 219, 209, 212, 234, 232, 220, 201, 207, 209, 212, 217, 202, 208, 210, 211, 218, 205, 214, 247, 215, 221, 206, 213, 248, 216, 222, 207, 213, 216, 217, 223, 208, 214, 215, 218, 224, 211, 215, 218, 219, 227, 212, 216, 217, 220, 228, 209, 217, 220, 241, 225, 210, 218, 219, 242, 226, 78, 213, 222, 253, 223, 79, 214, 221, 254, 224, 66, 215, 221, 224, 227, 67, 216, 222, 223, 228, 44, 219, 226, 227, 249, 45, 220, 225, 228, 250, 56, 217, 223, 225, 228, 57, 218, 224, 226, 227, 203, 230, 231, 239, 235, 204, 229, 232, 240, 236, 201, 209, 229, 232, 233, 202, 210, 230, 231, 234, 209, 231, 234, 235, 241, 210, 232, 233, 236, 242, 229, 233, 236, 239, 243, 230, 234, 235, 240, 244, 203, 205, 238, 239, 247, 204, 206, 237, 240, 248, 229, 235, 237, 240, 245, 230, 236, 238, 239, 246, 219, 233, 242, 243, 249, 220, 234, 241, 244, 250, 235, 241, 244, 245, 251, 236, 242, 243, 246, 252, 239, 243, 246, 247, 255, 240, 244, 245, 248, 256, 213, 237, 245, 248, 253, 214, 238, 246, 247, 254, 42, 225, 241, 250, 251, 43, 226, 242, 249, 252, 54, 243, 249, 252, 255, 55, 244, 250, 251, 256, 76, 221, 247, 254, 255, 77, 222, 248, 253, 256, 64, 245, 251, 253, 256, 65, 246, 252, 254, 255, 259, 267, 279, 263, 260, 268, 280, 264, 257, 277, 261, 285, 258, 278, 262, 286, 259, 285, 263, 269, 260, 286, 264, 270, 257, 261, 267, 271, 258, 262, 268, 272, 267, 281, 279, 273, 268, 282, 280, 274, 257, 263, 265, 275, 258, 264, 266, 276, 38, 261, 270, 293, 271, 39, 262, 269, 294, 272, 26, 263, 269, 272, 275, 27, 264, 270, 271, 276, 4, 265, 274, 275, 289, 5, 266, 273, 276, 290, 16, 267, 271, 273, 276, 17, 268, 272, 274, 275, 259, 279, 287, 283, 260, 280, 288, 284, 257, 265, 277, 281, 258, 266, 278, 282, 265, 279, 283, 289, 266, 280, 284, 290, 277, 281, 287, 291, 278, 282, 288, 292, 259, 261, 287, 293, 260, 262, 288, 294, 277, 283, 285, 295, 278, 284, 286, 296, 2, 273, 281, 290, 291, 3, 274, 282, 289, 292, 14, 283, 289, 292, 295, 15, 284, 290, 291, 296, 36, 269, 285, 294, 295, 37, 270, 286, 293, 296, 24, 287, 291, 293, 296, 25, 288, 292, 294, 295};
std::vector<idx_t> weight = {254100, 326700, 23100, 29700, 23100, 29700, 231000, 297000, 13860, 17820, 13860, 17820, 565950, 727650, 23100, 29700, 21000, 27000, 1260, 1620, 1260, 1620, 51450, 66150, 23100, 29700, 21000, 27000, 1260, 1620, 1260, 1620, 51450, 66150, 254100, 326700, 23100, 29700, 23100, 29700, 231000, 297000, 13860, 17820, 13860, 17820, 565950, 727650, 13860, 17820, 1260, 1620, 1260, 1620, 12600, 16200, 30870, 39690, 13860, 17820, 1260, 1620, 1260, 1620, 12600, 16200, 30870, 39690, 127050, 163350, 11550, 14850, 11550, 14850, 115500, 148500, 6930, 8910, 6930, 8910, 282975, 363825, 57750, 74250, 5250, 6750, 5250, 6750, 52500, 67500, 3150, 4050, 3150, 4050, 128625, 165375, 196350, 252450, 17850, 22950, 17850, 22950, 178500, 229500, 10710, 13770, 10710, 13770, 437325, 562275, 57750, 74250, 5250, 6750, 5250, 6750, 52500, 67500, 3150, 4050, 3150, 4050, 128625, 165375, 92400, 118800, 8400, 10800, 8400, 10800, 84000, 108000, 5040, 6480, 5040, 6480, 205800, 264600, 99000, 9000, 9000, 90000, 5400, 5400, 220500, 23760, 2160, 2160, 21600, 1296, 1296, 52920, 118800, 10800, 10800, 108000, 6480, 6480, 264600, 23760, 2160, 2160, 21600, 1296, 1296, 52920, 118800, 10800, 10800, 108000, 6480, 6480, 264600, 23760, 2160, 2160, 21600, 1296, 1296, 52920, 118800, 10800, 10800, 108000, 6480, 6480, 264600, 23760, 2160, 2160, 21600, 1296, 1296, 52920, 99000, 9000, 9000, 90000, 5400, 5400, 220500, 756, 972, 756, 972, 630, 810, 630, 810, 630, 810, 630, 810, 378, 486, 378, 486, 378, 486, 378, 486, 630, 810, 630, 810, 630, 810, 630, 810, 756, 972, 756, 972, 630, 810, 630, 810, 630, 810, 630, 810, 378, 486, 378, 486, 378, 486, 378, 486, 630, 810, 630, 810, 630, 810, 630, 810, 2100, 2700, 2100, 2700, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 2100, 2700, 2100, 2700, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620, 1260, 1620};
if (xadj.size() != nME + 1 || adjncy.size() != xadj.back())
throw std::runtime_error("xadj or adjncy tables are wrong.");
idx_t n = static_cast<idx_t>(nME);
idx_t ncon = 1;
idx_t nparts = 64;
std::vector<idx_t> part(n);
idx_t objval;
idx_t options[METIS_NOPTIONS];
METIS_SetDefaultOptions(options);
options[METIS_OPTION_OBJTYPE] = METIS_OBJTYPE_VOL;
options[METIS_OPTION_CTYPE] = METIS_CTYPE_SHEM;
options[METIS_OPTION_IPTYPE] = METIS_IPTYPE_GROW;
options[METIS_OPTION_RTYPE] = METIS_RTYPE_FM;
options[METIS_OPTION_NO2HOP] = 1;
options[METIS_OPTION_NCUTS] = 10;
options[METIS_OPTION_MINCONN] = 1;
options[METIS_OPTION_CONTIG] = 1;
options[METIS_OPTION_NUMBERING] = 0;
options[METIS_OPTION_UFACTOR] = 10; // Used to set a limit on imbalanceness between partitions
options[METIS_OPTION_SEED] = 1; // Set a seed for the randomness
options[METIS_OPTION_NITER] = 10; // Set a max number of iterations for metis to do its work
// options[METIS_OPTION_DBGLVL] = METIS_DBG_INFO; // Set the verbose level for debugging
int status = METIS_PartGraphKway(&n, &ncon, xadj.data(), adjncy.data(), weight.data(), NULL, NULL, &nparts, NULL, NULL, options, &objval, part.data());
if (status != METIS_OK) {
switch (status)
{
case METIS_ERROR_INPUT:
throw std::runtime_error("Error: METIS's input error.");
break;
case METIS_ERROR_MEMORY:
throw std::runtime_error("Error: METIS memory error.");
break;
default:
throw std::runtime_error("Error: METIS error.");
break;
}
}
return 0;
}
I was expecting METIS to be able to partition my graph into something like 41 partitions with 5 nodes on it and 64-41=23 partitions with 4 nodes on it (since 297 = 5*41 + 4 * (65-41)). But instead, I see several messages in my terminal saying :
***Cannot bisect a graph with 0 vertices!
***You are trying to partition a graph into too many parts!
So I’ve looked into METIS, and here is how it leads to this : (line number could be approximative, watch out)
kmetis.c:METIS_PartGraphKway(…) (l.74) -> kmetis.c:MlevelKWayPartitioning(…) (l.124)-> kmetis.c:InitKWayPartitioning() (l.200)-> pmetis.c:METIS_PartGraphRecursive() (l.133)-> pmetis.c:MlevelRecursiveBisection()
The last function is recursive and has this block of code at the beginning :
if ((nvtxs = graph->nvtxs) == 0) {
printf("t***Cannot bisect a graph with 0 vertices!n"
"t***You are trying to partition a graph into too many parts!n");
return 0;
}
So apparently, it arrives here with this variable nvtxs
, which is the number of nodes in a graph, set to 0, and this causes this message to be printed.
But first, it does not lead METIS to a crash, and not even return a error status from METIS, the function returns 0, and I end up with no error from METIS but an incomplete and/or invalid partition of my graph… Typically, some partitions have 0 nodes !
It feels like METIS is kind of telling me : “you’re partitioning a too small graph into too many parts. I deals with bigger graph and my developers excluded me of dealing with such small graphs.”
So I guess my question is how should I do differently to have METIS partition my graph ? I’ve tried settings the options parameter differently to avoid doing a recursive bisection of my graph, but no success.
And according to this paper from the METIS people : “The k-way partitioning problem is most frequently solved by recursive bisection. … There are a number of advantages of computing the k-way partitioning directly (rather than computing it successively via recursive bisection). First, the entire graph now needs to be coarsened only once, reducing the complexity of this phase to O(|E|) down from O(|E| log k). Second, it is well known that recursive bisection can do arbitrarily worse than k-way partitioning [27]. Thus, a method that obtains a k-way partitioning directly can potentially produce much better partitionings. … In our algorithm, the k-way partitioning of Gm is computed using our multilevel bisection algorithm [16]. Our experience has shown that our multilevel recursive bisection algorithm produces good initial partitionings and requires a relatively small amount of time as long as the size of the original graph is sufficiently larger than k.”
It seems that way indeed…