-
Notifications
You must be signed in to change notification settings - Fork 0
Gigel and Mafia is an algorithm oriented course homework exploiting graph representations of relationships between clans of Mafia families primarily through reductions to the Boolean Satisfiability Problem. Its implementation is conducted in Java.
sorinabuf/Reductions-to-SAT-Problem
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
******************************************************************************** Analysis of Algorithms - Homework 2 Gigel and Mafia Buf Sorina-Anamaria 321CA ******************************************************************************** The purpose of this homework is to manoeuvre an investigation on Mafia families, having as ultimate goal to abolish the relations between the clan members through multiple arrests. In order to do so, the current problem will be reduced to the SAT problem, along the different stages of the mission, thus streamlining the investigation process by communicating with the Antic Oracle. Generally, transforming the current problem to SAT will mean creating a boolean formula, that will later be evaluated by the Oracle, determining if there exists an interpretation that satisfies the given conjunctive normal form, similar to other famous graph problems (e.g. reducing graph k-coloring to SAT). ▶ Task 1 - SPY PLANTING The first task of Gigel consists in planting a number of spies in the Mafia families, so each family would get infiltrated, but two related clans could not have the same spy. In order to explain the logic behind the solution of this problem, in the demonstration that is to follow, we will consider the next variables: ◌ N - number of the Mafia families; ◌ M - number of the connections between the families (undirected relations); ◌ K - the maximum number of spies that may be planted. The reduction to SAT problem and consequently the boolean formula that the current problem is transformed to was implemented in the function: ◉ formulateOracleQuestion() The conjunctive normal form will be represented by a number of clauses and variables, component of the clauses formulated. The clausal normal form is a conjunction of one or more clauses, while a clause is a disjunction of literals, here represented by the variables that are to be declared. We will consider the following: ◌ number of variables = N * K, attributed as following: variable: 1 → family no.1 may be infiltrated by spy no.1 variable: 2 → family no.2 may be infiltrated by spy no.1 variable: 3 → family no.3 may be infiltrated by spy no.1 . . . variable: N → family no.N may be infiltrated by spy no.1 variable: N + 1 → family no.1 may be infiltrated by spy no.2 . . . variable: N * (K - 1) + 1 → family no.1 may be infiltrated by spy no.K variable: N * (K - 1) + 2 → family no.2 may be infiltrated by spy no.K variable: N * (K - 1) + 3 → family no.3 may be infiltrated by spy no.K . . . variable: N * (K - 1) + N → family no.N may be infiltrated by spy no.K ◌ number of clauses = N + M * K + N * K * (K - 1) / 2, thus explained: The first series of clauses is represented by the first "for loop" present in the current function, that displays the conditions that must be respected regarding the spies distribution amongst the families: each Mafia family must have at least one spy. Thus, there will be "N" clauses, one clause related to each family, formed of a number of "K" literals, represented by the corresponding variables that illustrated the fact that in each family may be planted one of the spies: clause no.1 → 1 ∨ (N + 1) ∨ . . . ∨ (N * (K - 1) + 1) clause no.2 → 2 ∨ (N + 2) ∨ . . . ∨ (N * (K - 1) + 2) . . . clause no.N → N ∨ (N + N) ∨ . . . ∨ (N * (K - 1) + N) ● Time complexity of constructing the first clauses is equivalent to parsing the first "for" of families and the second one (used for variable construction): O(N * K). The second series of clauses is represented by the second "for loop" block, illustrating the strictness over spy distribution between connected families: two related families can not have the same spy. Thus, there will be "M * K" clauses; each pair of families must have distinct spy, therefore the following clauses: . . . clause no.i → ¬family no.k with spy no.1 ∨ ¬family no.l with spy no.1 clause no.i+1 → ¬family no.k with spy no.2 ∨ ¬ family no.l with spy no.2 . . . ● Time complexity of constructing the second clauses is equivalent to parsing the first "for" of relations and the second one of spies: O(M * K). The third series of clauses is represented by the third loop block, responsible for creating the clauses regarding the distinction distribution of infiltrates over a family: a family must not have more than a spy. Thus, there will be "N * K * (K - 1) / 2" clauses, as each pair of non-repetitive, distinct spies can not coexist in a clan: clause no.1 → ¬1 ∨ ¬(N + 1) clause no.2 → ¬1 ∨ ¬(N * 2 + 1) . . . clause no.K-1 → ¬1 ∨ ¬(N * (K - 1) + 1) clause no.K → ¬(N + 1) ∨ ¬(N * 2 + 1) . . . ● Time complexity of constructing the third clauses is equivalent to parsing the first "for" of relations, the second one of spies and the third one of remaining spies: O(M * (1 + 2 + ... + (K - 1))) = O(N * K * (K - 1) / 2). ⇒ Final complexity of SAT reduction: O(N * K + M * K + N * K * (K - 1) / 2). The rest of the functions used to resolve the task are: ◉ readProblemData() → responsible for populating the main attributes of the problem and for storing the relations between the family in a O(M) complexity; ◉ decipherOracleAnswer() → extracts the answer of the oracle consisting in a list of size "N * K" of variables, positive (the ones marked as TRUE) and negative (the ones marked as FALSE), and converts the list into the problem's answer in a O(N * K) time complexity; ◉ writeAnswer() → writes the index of the spy assigned to each family (O(N)). ▶ Task 2 - EXTENDED FAMILIES INVESTIGATIONS The second task of Gigel consists in determining the most influential groups of families, represented by the extended ones, in which each two families must be related. In order to explain the logic behind the solution of this problem, in the demonstration that is to follow, we will consider the next variables: ◌ N - number of the Mafia families; ◌ M - number of the connections between the families (undirected relations); ◌ K - size of the extended searched family; ◌ X - number of non-edges = N * (N - 1) / 2 - M. The reduction to SAT problem and consequently the boolean formula that the current problem is transformed to was implemented in the function: ◉ formulateOracleQuestion() The conjunctive normal form will be represented by a number of clauses and variables, component of the clauses formulated, similar to the previous task. This algorithm can be seen as the generalized clique problem, where we want to find a clique of size "K" in the graph of family relations. ◌ number of variables = N * K, attributed as following: variable: 1 → family no.1 may be node no.1 in the clique variable: 2 → family no.2 may be node no.1 in the clique variable: 3 → family no.3 may be node no.1 in the clique . . . variable: N → family no.N may be node no.1 in the clique variable: N + 1 → family no.1 may be node no.2 in the clique . . . variable: N * (K - 1) + 1 → family no.1 may be node no.K in the clique variable: N * (K - 1) + 2 → family no.2 may be node no.K in the clique variable: N * (K - 1) + 3 → family no.3 may be node no.K in the clique . . . variable: N * (K - 1) + N → family no.N may be node no.K in the clique ◌ number of clauses = K + K * N * (N - 1) / 2 + X * K * (K - 1) + N * K * (K - 1) / 2, thus explained: The first series of clauses is represented by the first "for loop" present in the current function, that displays the conditions that must be respected regarding the family distribution amongst the clique: each node of the clique must be represented by a family. There will be "K" clauses, one clause related to each node, formed of a number of "N" literals, represented by the corresponding variables: clause no.1 → 1 ∨ 2 ∨ . . . ∨ N clause no.2 → (N + 1) ∨ (N + 2) ∨ . . . ∨ (N + N) . . . clause no.K → (N * (K - 1) + 1) ∨ . . . ∨ (N * (K - 1) + N) ● Time complexity of constructing the first clauses is equivalent to parsing the first "for" of nodes and the second one (used for variable construction): O(K * N). The second series of clauses is represented by the second loop block, responsible for creating the clauses regarding the distinction distribution of families over a node: a node must not be represented by more than a family. Thus, there will be "K * N * (N - 1) / 2" clauses, as each pair of non-repetitive, distinct families can not coexist as the same clique node: clause no.1 → ¬1 ∨ ¬2 clause no.2 → ¬1 ∨ ¬3 . . . clause no.N-1 → ¬1 ∨ ¬N clause no.N → ¬2 ∨ ¬3 . . . ● Time complexity of constructing the second clauses is equivalent to parsing the first "for" of nodes, the second one of families and the third one of remaining families: O(K * (1 + 2 + ... + (N - 1))) = O(K * N * (N - 1) / 2). The third series of clauses is represented by the third loop block, responsible for creating the clauses regarding the representation over a node, regarding the families: a family must not represent more than a clique node. Thus, there will be "N * K * (K - 1) / 2" clauses, as each pair of non-repetitive, distinct nodes can not coexist as having the same family: clause no.1 → ¬1 ∨ ¬(N + 1) clause no.2 → ¬1 ∨ ¬(N * 2 + 1) . . . clause no.K-1 → ¬1 ∨ ¬(N * (K - 1) + 1) clause no.K → ¬(N + 1) ∨ ¬(N * 2 + 1) . . . ● Time complexity of constructing the third clauses is equivalent to parsing the first "for" of families, the second one of nodes and the third one of remaining nodes: O(N * (1 + 2 + ... + (K - 1))) = O(N * K * (K - 1) / 2). The last series of clauses is defined in the final "for" segment, aiming to create clauses that illustrate the fact that two families that are unconnected can not coexist in the searched clique. Therefore, we write "X * K (K - 1)" clauses, as two families with no direct relation between them can not represent different nodes in the clique: . . . clause no.i → ¬family no.k as node no.1 ∨ ¬family no.l as node no.2 clause no.i+1 → ¬family no.k as node no.1 ∨ ¬family no.l as node spy no.3 . . . ● Time complexity of constructing the final clauses is equivalent to parsing the first two "for" of relations and the second ones of nodes: O(N^2 * K^2). ⇒ Final complexity of SAT reduction: O(K * N + K * N * (N - 1) / 2 + N * K * (K - 1) / 2 + N^2 * K^2). The rest of the functions used to resolve the task are: ◉ readProblemData() → responsible for populating the main attributes of the problem and for storing the relations between the family in a O(N + M + 1) complexity; ◉ decipherOracleAnswer() → extracts the answer of the oracle consisting in a list of size "N * K" of variables, positive (the ones marked as TRUE) and negative (the ones marked as FALSE), and converts the list into the problem's answer in a O(N * K) time complexity; ◉ writeAnswer() → writes the index of the family assigned to each node of the clique (O(K)). ▶ Task 3 - MAFIA ARRESTS Final task of Gigel consists in arresting the minimum number of families in order to disconnect all the clans and, consequently, destroy the Mafia grid. In order to explain the logic behind the solution of this problem, in the demonstration that is to follow, we will consider the next variables: ◌ N - number of the Mafia families; ◌ M - number of the connections between the families (undirected relations); ◌ K - size of the extended searched family; ◌ X - number of non-edges = N * (N - 1) / 2 - M. As a way to resolve this problem, we will use the previously implemented solution from task 2, requesting multiple answers from the Oracle. The main idea of the programme is to try finding the maximal clique in the complementary graph of relations between Mafia families, as the minimum number of arrests that need to be undertaken will be consisted of the families that are not found in the answer requested by the Antic Oracle. The utility functions used for resolving this task are: ◉ readProblemData() → responsible for populating the main attributes of the problem and for storing the relations between the families in the list of adjacency lists, while doing the same for the complementary graph, in a O(N + M + 1 + N^2) time complexity; ◉ reduceToTask2() → creates the input file of the Task 2 with the attributes of the current problem transferred to its complement; ◉ extractAnswerFromTask2() → extracts the answer given by the call of Task2, consisting in the maximal clique of size "K" present in the complementary graph; ◉ writeAnswer() → writes the minimal arrests needed to be performed. The main method of the programme, responsible for controlling the flow of multiple requests to the Oracle if defined as it follows: ◉ solve() In order to find the maximal clique in the complementary graph, by multiply requesting help from the Oracle, we need to set up the starting value of "K", considered as maximum, that is going to decrease in the formulated problem until a valid solution is discovered. Taking into account that, in a complete graph, the maximal amount of edges is "n * (n - 1) / 2" (where "n" is the number of nodes), we can set the superior limit of "K" by considering that all the edges given in the problem are part of the same complete subgraph: ⇒ K * (K - 1) = 2 * X ⇒ K^2 - K = 2 * X ⇒ (K - 0.5)^2 = 2 * X + 0.25 ⇒ K - 0.5 = sqrt(2 * X + 0.25) ⇒ K = 0.5 + sqrt(2 * X + 0.25) (final formula) Considering the complexity previously calculated for resolving task 2, we can agree that the complexity of this method will consist in the one already set at the second stage of the mission, multiplied by the difference between the optimized "K" discovered on the basis of the number of non-edges and the actual size of a maximal clique that is to be found. ▶ Task 4 - MAFIA ARRESTS (BONUS) In this special part of the mission, we are trying to reduce the questions addressed to the Oracle to a single one by transforming the input format to the Weighted Partial Max-SAT Input format. Therefore, the main functions will be identical to the Task 2, transferring the logic of the problem to this task, but applied to the complementary graph of relations, in the same way as we managed Task 3. Therefore, we can say that this task will consist in a combination between the second and the third one, modified in such a way as we get the correct answer through one Oracle call. In Task 2, we considered four types of clauses, now applied on the complementary problem (we consider the notations from previous tasks): ◌ "K" clauses representing the fact that a node of the clique must be represented by one family; ◌ "K * N * (N - 1) / 2" clauses corresponding to the fact that a node of a clique can not be represented by more than a family; ◌ "N * K * (K - 1) / 2" clauses corresponding to the fact that a family can not represent more than a node in the clique; ◌ "M * K * (K - 1)" clauses representing the fact that for every two families that form a relation, they can not both be present in the complementary clique. In the Weighted Partial Max-SAT Input format, it must be defined the weight value of hard clauses (the obligatory ones), and the weight of soft clauses (the optional ones), that will precede every clause in the input oracle file. In order to find the maximal possible clique, we will consider all the clauses as being hard, except the first set, that defined the existence of the respective nodes. Also, as the minimal size of a clique is considered as two, we will define two of the first clauses as obligatory to be fulfilled. I considered the soft clauses, with their respective weights, thus: clause no.3 → 1: (N * 2 + 1) ∨ (N * 2 + 2) ∨ . . . ∨ (N * 2 + N) clause no.4 → 2 : (N * 3 + 1) ∨ (N * 3 + 2) ∨ . . . ∨ (N * 3 + N) clause no.5 → 4 : (N * 4 + 1) ∨ (N * 4 + 2) ∨ . . . ∨ (N * 4 + N) . . . clause no.K → 2^(K-3): (N * (K - 1) + 1) ∨ . . . ∨ (N * (K - 1) + N) ⇒ hard clauses weight = sum of all the weights of soft clauses + 1 ⇒ hard clauses weight = 1 + 2 + 2^2 + ... + 2^(K - 3) + 1 ⇒ hard clauses weight = 2^(K - 2) - 1 + 1 ⇒ hard clauses weight = 2^(K - 2) Therefore, we created the respective clauses of this problem. A final aspect to be reminded is the chosen value of "K", as I tried to optimize the previous value set for the size of the possible maximal clique: ⇒ before: K = 0.5 + sqrt(2 * X + 0.25) A method to reduce the size of "K" was trying to find the maximal degree of a family in a complete graph by verifying the corresponding number of relations of their neighbours, which should be equal or greater than the initial family's. For more details related to effective implementation, verify the comments in the code. ********************************************************************************
About
Gigel and Mafia is an algorithm oriented course homework exploiting graph representations of relationships between clans of Mafia families primarily through reductions to the Boolean Satisfiability Problem. Its implementation is conducted in Java.
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published