It is almost impossible for me to read contemporary mathematicians who, instead of saying “Petya washed his hands,” write simply: “There is a \(t_1 < 0\)t 1 < 0 such that the image of \(t_1\)t 1 under the natural mapping \(t_1 \mapsto \mathit{Petya}(t_1)\)t 1 Petya(t 1) belongs to the set of dirty hands, and a \(t_2, t_1 < t_2 \le 0\)t 2, t 1 < t_2 0, such that the image of \(t_2\)t 2 under the above-mentioned mapping belongs to the complement of the set defined in the preceding sentence.” — Vladimir Igorevich Arnol’d (Zdravkovska and Arnol’d 1987, 30)
Gröbner bases can be used for solving systems of polynomial equations. A system of multivariate equations can be seen as an ideal basis and a Gröbner basis is a particular basis that yields the same affine variety but has a structure amenable to elimination of variables.
Theorem n4. (Hilbert Basis Theorem.) For every ideal \(I \subseteq \mathbb{F}[\mathbf{x}]\)I F[x], there exist \(g_1, \ldots, g_m \in I\)g 1, , g m I such that \(I = \langle g_1, \ldots, g_m \rangle\)I = g 1, , g m (Cox 2015).
Definition n5. Let \(I \subseteq \mathbb{F}[\mathbf{x}]\)I F[x] be an ideal different from \(\{0\}\)\0\. We denote by \(\mathrm{LT}(I) = \{\mathrm{LT}(f) \mid f \in I\}\)LT(I) = \\mathrmLT}(f) \mid f \in I\} the set of leading terms of nonzero elements of \(I\)I. The ideal of leading terms of \(I\)I, generated by \(\mathrm{LT}(I)\)LT(I), is denoted \(\langle \mathrm{LT}(I) \rangle\) LT(I) .
Definition nh. Fix a monomial order on \(\mathcal{M}\)M. A finite basis \(G \subseteq I\)G I of a nonzero ideal \(I \subseteq \mathbb{F}[\mathbf{x}]\)I F[x] is a Gröbner basis if \[\langle \mathrm{LT}(G) \rangle = \langle \mathrm{LT}(I) \rangle.\]
A set \(\{g_1, \ldots, g_m\} \subseteq I\)\g 1, \ldots, g m\ \subseteq I is a Gröbner basis if and only if the leading term of any element of \(I\)I is divisible by some \(\mathrm{LT}(g_i)\)LT(g i).
Proposition n7. Fix a monomial order on \(\mathcal{M}\)M. Then every ideal \(I \subseteq \mathbb{F}[\mathbf{x}]\)I F[x] has a Gröbner basis (Cox 2015).
Definition n6. Let \(G \subseteq I\)G I be a Gröbner basis of \(I\)I. We call \(G\)G a reduced Gröbner basis if for all \(g \in G\)g G:
- \(\mathrm{LC}(g) = 1\)LC(g) = 1.
- No monomial of \(g\)g is in \(\langle \mathrm{LT}(G \setminus \{g\}) \rangle\) LT(G \g\) \rangle.
Most computer algebra systems compute reduced Gröbner bases by default. Any ideal has a unique reduced Gröbner basis. Given \(g \in G\)g G, let \(g’ = \overline{g}^{G \setminus \{g\}}\)g’ = g G \g\}} and \(G’ = (G \setminus \{g\}) \cup \{g’\}\)G’ = (G \g\) \cup \g’\. If we keep applying this process until all elements are fully reduced, we obtain the reduced Gröbner basis.
Definition n9. Let \(I = \langle f_1, \ldots, f_m \rangle \subseteq \mathbb{F}[x_1, \ldots, x_n]\)I = f 1, , f m \subseteq \mathbbF[x_1, \ldots, x_n] be an ideal. The \(l\)l-th elimination ideal \(I_l\)I l is the ideal of \(\mathbb{F}[x_{l+1}, \ldots, x_n]\)F[x l+1, , x n] given by \(I_l = I \cap \mathbb{F}[x_{l+1}, \ldots, x_n]\)I l = I F[x l+1, , x_n].
Theorem dy. (Elimination Theorem.) Let \(G \subseteq I \subseteq \mathbb{F}[x_1, \ldots, x_n]\)G I F[x 1, \ldots, x n] be a Gröbner basis of \(I\)I under the lexicographic order \(x_1 \succeq_{\mathrm{lex}} x_2 \succeq_{\mathrm{lex}} \cdots \succeq_{\mathrm{lex}} x_n\)x 1 lex x_2 _{\mathrm{lex}} \cdots \succeq_{\mathrm{lex}} x_n. Then, for every \(0 \le l < n\)0 l < n, the set \(G_l = G \cap \mathbb{F}[x_{l+1}, \ldots, x_n]\)G l = G F[x l+1, , x_n] is a Gröbner basis of the \(l\)l-th elimination ideal \(I_l\)I l.
Proof. We know that \(G_l \subseteq I_l\)G l I l by construction, so we only need to show that \(\langle \mathrm{LT}(I_l) \rangle = \langle \mathrm{LT}(G_l) \rangle\) LT(I l) = \langle \mathrmLT(G l) \rangle for a fixed \(l\)l between \(0\)0 and \(n\)n. The inclusion \(\langle \mathrm{LT}(I_l) \rangle \supseteq \langle \mathrm{LT}(G_l) \rangle\) LT(I l) \supseteq \langle \mathrmLT(G l) \rangle is evident. To prove \(\langle \mathrm{LT}(I_l) \rangle \subseteq \langle \mathrm{LT}(G_l) \rangle\) LT(I l) \subseteq \langle \mathrmLT(G l) \rangle, we need to show that \(\mathrm{LT}(g) \mid \mathrm{LT}(f)\)LT(g) LT(f) for an arbitrary \(f \in I_l\)f I l and some \(g \in G_l\)g G l. We know that \(f\)f is also in \(I\)I, so \(\mathrm{LT}(g) \mid \mathrm{LT}(f)\)LT(g) LT(f) for some \(g \in G\)g G since \(G\)G is a Gröbner basis of \(I\)I. Since \(f \in I_l\)f I l, \(\mathrm{LT}(g)\)LT(g) must consist only of \(x_{l+1}, \ldots, x_n\)x l+1, , x n. The crucial observation is: since \(x_1 \succeq_{\mathrm{lex}} \cdots \succeq_{\mathrm{lex}} x_n\)x 1 lex \succeq_{\mathrm{lex}} x_n, any monomial involving any \(x_1, \ldots, x_l\)x 1, , x l is greater than all monomials in \(\mathbb{F}[x_{l+1}, \ldots, x_n]\)F[x l+1, , x n]. Therefore \(g \in G_l\)g G l.
Example db. Consider a system of equations where \(f_1 = f_2 = f_3 = 0\)f 1 = f 2 = f_3 = 0 are polynomials in \(\mathbb{R}[x, y, z]\)R[x, y, z] with \[f_1 = -16x^2 - 4xy^2 + 4xz, \quad f_2 = 4x^2 z + 2xy^2 z + z, \quad f_3 = xy^2 + 2y^2 z + 1.\] Computing the reduced Gröbner basis with \(z \succeq_{\mathrm{lex}} y \succeq_{\mathrm{lex}} x\)z lex y {\mathrm{lex}} x gives \[g_1 = x + \tfrac{1}{4}y^2 - \tfrac{1}{4}z, \quad g_2 = y^4 - z^2 - 4, \quad g_3 = y^2 z - \tfrac{1}{9}z^2, \quad g_4 = z^3 + \tfrac{81}{20}z.\] The last polynomial involves only \(z\)z and has the only real solution \(z = 0\)z = 0. Substituting into \(g_3\)g 3 and \(g_2\)g 2 yields \(y = \pm\sqrt{2}\)y = 2, and then \(x = -\frac{1}{2}\)x = -12. The reduced Gröbner basis allows solving the system by successive elimination of variables.
Definition dn. Let \(\mathbb{F}_q[x_1, \ldots, x_n]\)F q[x 1, , x_n] be a polynomial ring over the finite field \(\mathbb{F}_q\)F q with order \(q = p^m\)q = p m where \(p\)p is prime and \(m \in \mathbb{N}_{>0}\)m N >0. The field equations of \(\mathbb{F}_q\)F q are the polynomials \(x_i^q - x_i\)x i q - x_i for every \(x_i \in \{x_1, \ldots, x_n\}\)x i \x 1, \ldots, x_n\.
Adding the field equations into a polynomial system over \(\mathbb{F}_q\)F q restricts the solutions to \(\mathbb{F}_q\)F q and ensures the system has finitely many solutions.
Theorem dd. (Finiteness Theorem.) Let \(f_1, \ldots, f_m \in \mathbb{F}[x_1, \ldots, x_n]\)f 1, , f m F[x_1, \ldots, x_n] be polynomials. If \(\langle f_1, \ldots, f_m \rangle \cap \mathbb{F}[x_i] \neq 0\) f 1, , f m \cap \mathbbF[x_i] \neq 0 for all \(x_i\)x i, then the variety \(V(\langle f_1, \ldots, f_m \rangle) \subseteq \mathbb{F}^n\)V( f 1, , f m ) \subseteq \mathbbF^n is finite (Cox 2015).
Theorem dr. (Hilbert’s Weak Nullstellensatz.) Let \(f_1, \ldots, f_m \in \mathbb{F}[\mathbf{x}]\)f 1, , f m F[\mathbfx] be polynomials. Then the following are equivalent:
- There exists an extension field \(\mathbb{E}\)E of \(\mathbb{F}\)F and \(\mathbf{a} \in \mathbb{E}^n\)a E n such that \(f_i(\mathbf{a}) = 0\)f i(a) = 0 for all \(f_i\)f i.
- \(1 \notin \langle f_1, \ldots, f_m \rangle\)1 f 1, , f m \rangle (Becker 1993).
Whenever \(1\)1 belongs to an ideal \(I\)I, we have \(I = \mathbb{F}[\mathbf{x}]\)I = F[x], as shown in the proof of Proposition p. Therefore, if the reduced Gröbner basis consists of the single element \(1\)1, the polynomial system has no solutions.
It can be shown that if we have a finite field \(\mathbb{F}\)F and its extension \(\mathbb{E}\)E, all elements from \(\mathbb{F}\)F satisfy the field equations of \(\mathbb{F}\)F and no element in \(\mathbb{E} \setminus \mathbb{F}\)E F satisfies any of them. Therefore, adding the field equations into a polynomial system restricts solutions to \(\mathbb{F}\)F.
Example di. Consider a system of equations where \(f_1 = f_2 = f_3 = 0\)f 1 = f 2 = f_3 = 0 are polynomials in \(\mathrm{GF}(2)\)GF(2) with \[f_1 = x + y + z, \quad f_2 = xy + xz + yz, \quad f_3 = xyz + 1.\] Computing the reduced Gröbner basis with \(z \succeq_{\mathrm{lex}} y \succeq_{\mathrm{lex}} x\)z lex y {\mathrm{lex}} x gives \[g_1 = x + y + z, \quad g_2 = y^2 + yz + z^2, \quad g_3 = z^3 + 1.\] The only solution to the last polynomial is \(z = 1\)z = 1. Substituting into \(g_2\)g 2 gives \(g_2’ = y^2 + y + 1\)g 2’ = y 2 + y + 1, which has no solution in \(\mathrm{GF}(2)\)GF(2), so the system has no solution in \(\mathrm{GF}(2)\)GF(2). Note that \(g_2’\)g 2’ is also irreducible over \(\mathrm{GF}(2)\)GF(2). By Proposition na, let \(\alpha\) be a root of \(g_2’\)g 2’, i.e., the coset \(y + \langle g_2’ \rangle\)y + g 2’ in \(\mathbb{E} = \mathrm{GF}(2)[y] / \langle g_2’ \rangle\)E = GF(2)[y] / g 2’ \rangle. We get the finite field \(\mathrm{GF}(2)(\alpha) \cong \mathrm{GF}(2^2)\)GF(2)() \mathrmGF(2 2) with elements \(0, 1, \alpha, \alpha + 1\)0, 1, , + 1. The polynomial \(g_2\)g 2 has two solutions in \(\mathrm{GF}(2^2)\)GF(2 2), namely \(y = \alpha\)y = and \(y = \alpha + 1\)y = + 1, and these lead to further solutions of the original system.
Example ds. Considering Example di, if we add the field equations of \(\mathrm{GF}(2)\)GF(2) into the system, the reduced Gröbner basis is \(g_1 = 1\)g 1 = 1. By the Hilbert’s Weak Nullstellensatz, the system has no solutions in \(\mathrm{GF}(2)\)GF(2).
Computing Gröbner Bases
Definition df. Let \(f, g \neq 0 \in \mathbb{F}[\mathbf{x}]\)f, g 0 F[\mathbfx] be polynomials. If \(\mathrm{multideg}(f) = \alpha\)multideg(f) = and \(\mathrm{multideg}(g) = \beta\)multideg(g) = , let \(\gamma = (\gamma_1, \ldots, \gamma_n)\) = ( 1, , \gamma n) where \(\gamma_i = \max(\alpha_i, \beta_i)\) i = ( i, \beta_i) for each \(i\)i. We call \(x^\gamma\)x the least common multiple of \(\mathrm{LM}(f)\)LM(f) and \(\mathrm{LM}(g)\)LM(g), written \(x^\gamma = \mathrm{lcm}(\mathrm{LM}(f), \mathrm{LM}(g))\)x = lcm(LM(f), \mathrm{LM}(g)). The S-polynomial of \(f\)f and \(g\)g is \[S(f, g) = \frac{x^\gamma}{\mathrm{LT}(f)} \cdot f - \frac{x^\gamma}{\mathrm{LT}(g)} \cdot g.\]
Example dg. Let \(f = x^2 y^3 + x\)f = x 2 y 3 + x and \(g = 2xy^5 + y\)g = 2xy 5 + y be polynomials in \(\mathbb{R}[x, y]\)R[x, y] under \(\succeq_{\mathrm{grlex}}\) grlex. The least common multiple is \(x^2 y^5\)x 2 y 5 and \[S(f, g) = \frac{x^2 y^5}{x^2 y^3}(x^2 y^3 + x) - \frac{x^2 y^5}{2xy^5}(2xy^5 + y) = xy^2 - \tfrac{1}{2}xy.\] The S-polynomial is designed so that the leading terms cancel.
Theorem d8. (Buchberger’s Criterion.) A basis \(G = \{g_1, \ldots, g_m\}\)G = \g 1, \ldots, g m\ of an ideal \(I\)I is a Gröbner basis if and only if \(\overline{S(g_i, g_j)}^G = 0\)S(g i, g j)^G = 0 for all \(i \neq j\)i j (Buchberger 2006).
Buchberger’s Criterion, introduced in (Buchberger 2006), allows testing whether a given basis is a Gröbner basis and naturally leads to an algorithm that constructs a Gröbner basis by adding nonzero remainders \(\overline{S(g_i, g_j)}^G\)S(g i, g j)^G to \(G\)G until the criterion holds.
Definition de. Let \(\succeq\) be a monomial order on \(\mathcal{M}\)M and let \(G = \{g_1, \ldots, g_m\} \subseteq \mathbb{F}[\mathbf{x}]\)G = \g 1, \ldots, g m\ \subseteq \mathbbF[\mathbf{x}] be a set of polynomials. For any \(f \in \mathbb{F}[\mathbf{x}]\)f F[x], we say that \(f\)f reduces to zero modulo \(G\)G, written \(f \to_G 0\)f G 0, if \(f\)f has a standard representation \[f = \sum_{i=1}^{m} h_i g_i, \quad h_i \in \mathbb{F}[\mathbf{x}],\] which means that whenever \(h_i g_i \neq 0\)h i g i 0, we have \(\mathrm{LM}(f) \succeq \mathrm{LM}(h_i g_i)\)LM(f) LM(h i g i).
Note that \(\overline{f}^G = 0\)f G = 0 implies \(f \to_G 0\)f G 0, but the converse does not hold.
Proposition dz. A basis \(G = \{g_1, \ldots, g_m\}\)G = \g 1, \ldots, g m\ of an ideal \(I\)I is a Gröbner basis if and only if \(S(g_i, g_j) \to_G 0\)S(g i, g j) _G 0 for all \(i \neq j\)i j (Cox 2015).
This is a more general version of Buchberger’s Criterion. Buchberger’s Criterion naturally leads to the Buchberger’s algorithm, which constructs a Gröbner basis by adding nonzero remainders \(\overline{S(g_i, g_j)}^G\)S(g i, g j)^G to \(G\)G until the criterion holds. The F4 algorithm (Faugère 1999), introduced by Jean-Charles Faugère, uses the generalized criterion of Proposition dz and computes standard representations via matrix reduction rather than polynomial division. It is tailored for the graded reverse lexicographic order and the result can be converted to the lexicographic order using the FGLM algorithm (Faugère et al. 1993) for applying the Elimination Theorem.
Remark da. A tuple \(F\)F of polynomials can be represented by a matrix \(A\)A and a vector \(v\)v as follows. Consider the set \(M_\succeq(F) = \{m_n, \ldots, m_0\}\)M (F) = \m n, \ldots, m_0\ of all monomials in \(F\)F under the monomial order \(\succeq\). Let \(v = (m_n, \ldots, m_0)^T\)v = (m n, , m 0)^T be the vector of all monomials in decreasing order, and let \(A[i, j]\)A[i, j] contain the coefficient of \(m_j\)m j in \(f_i\)f i. The rows of \(Av\)Av then represent the original polynomials in \(F\)F. We can think of \(v\)v as being implicit and consider only the matrix \(A\)A, whose rows represent the polynomials in \(F\)F.
The F4 algorithm uses a set \(B\)B of unordered pairs of polynomials for which the corresponding S-polynomials are not known to reduce to zero. The degree of a pair \(\{i, j\}\)\i, j\ is \(\deg(\{i, j\}) = \deg(\mathrm{lcm}(\mathrm{LM}(f_i), \mathrm{LM}(f_j)))\)(\i, j\) = \deg(\mathrmlcm(\mathrm{LM}(f i), \mathrm{LM}(f j))). In each iteration, the algorithm selects all pairs with the minimal degree (the normal selection strategy). For each selected pair \(\{i, j\}\)\i, j\, the set \(L\)L contains the two halves \[\frac{\mathrm{lcm}(\mathrm{LM}(f_i), \mathrm{LM}(f_j))}{\mathrm{LT}(f_i)} \cdot f_i \quad \text{and} \quad \frac{\mathrm{lcm}(\mathrm{LM}(f_i), \mathrm{LM}(f_j))}{\mathrm{LT}(f_j)} \cdot f_j\] of the S-polynomial \(S(f_i, f_j)\)S(f i, f j). The sets \(L\)L and \(G\)G are then passed to SymbolicPreprocessing, which returns a matrix \(M\)M representing a set \(H\)H as in Remark da. The row reduced echelon form of \(M\)M gives equations of the form \[\overline{S(f_i, f_j)} = S(f_i, f_j) - c_1 x^{\alpha(1)} f_{k_1} - c_2 x^{\alpha(2)} f_{k_2} - \cdots.\] When \(\overline{S(f_i, f_j)}\)S(f i, f j) is included in \(G\)G, we get a standard representation of \(S(f_i, f_j)\)S(f i, f j) so that Proposition dz holds. On line 12, we pick those polynomials (rows) whose leading terms are not divisible by the leading terms of any of the polynomials in \(M\)M. These new polynomials correspond to \(\overline{S(f_i, f_j)}\)S(f i, f j) and are included in \(G\)G.
Algorithm d3. F4 algorithm (Faugère 1999).
Input: a tuple of polynomials \(F = (f_1, \ldots, f_m)\)F = (f 1, , f m).
Output: a Gröbner basis \(G\)G of
\(I = \langle f_1, \ldots, f_m \rangle\)I = f 1, , f m .
- \(G \gets F\)G F, \(m \gets |F|\)m |F|, \(B \gets \{\{i, j\} \mid 1 \le i < j \le m\}\)B \\i, j\ \mid 1 \le i < j \le m\.
- while \(B \neq \emptyset\)B :
- \(d \gets \min\{\deg(\{i, j\}) \mid \{i, j\} \in B\}\)d \\deg(\i, j\) \mid \i, j\} \in B\}.
- \(B’ \gets \{\{i, j\} \in B \mid \deg(\{i, j\}) = d\}\)B’ \\i, j\ \in B \mid \deg(\i, j\}) = d\}, \(B \gets B \setminus B’\)B B B’.
- \(L \gets \left\{\frac{\mathrm{lcm}(\mathrm{LM}(f_i), \mathrm{LM}(f_j))}{\mathrm{LT}(f_i)} \cdot f_i \ \middle|\ \{i, j\} \in B’\right\}\)L \\frac\mathrmlcm(\mathrm{LM}(f i), \mathrm{LM}(f j))}{\mathrm{LT}(f_i)} \cdot f_i \ \middle|\ \{i, j\} \in B’\right\}.
- \(M \gets \mathrm{SymbolicPreprocessing}(L, G)\)M SymbolicPreprocessing(L, G).
- \(N \gets\)N row reduced echelon form of \(M\)M.
- \(N^+ \gets \{n \in \mathrm{rows}(N) \mid \mathrm{LM}(n) \notin \langle \mathrm{LM}(\mathrm{rows}(M)) \rangle\}\)N + \n \in \mathrmrows(N) \mid \mathrmLM}(n) \notin \langle \mathrm{LM}(\mathrm{rows}(M)) \rangle\}.
- for \(n \in N^+\)n N +:
- \(m \gets m + 1\)m m + 1, \(f_m \gets\)f m polynomial form of \(n\)n.
- \(G \gets G \cup \{f_m\}\)G G \f m\, \(B \gets B \cup \{\{i, m\} \mid 1 \le i < m\}\)B B \\i, m\ \mid 1 \le i < m\.
- return \(G\)G.
The SymbolicPreprocessing function creates a matrix \(M\)M representing a set \(H\)H with two properties: (i) \(L \subseteq H\)L H, and (ii) whenever \(x^\beta\)x is a monomial in some \(f \in H\)f H and there exists some \(f_k \in G\)f k G with \(\mathrm{LM}(f_k) \mid x^\beta\)LM(f k) x , then \(H\)H contains a product \(x^\alpha f_k\)x f k with \(\mathrm{LM}(x^\alpha f_k) = x^\beta\)LM(x f k) = x^. The algorithm starts by including all polynomials from \(L\)L into \(H\)H, satisfying (i). It then processes all monomials in \(H\)H until none remain: for each unprocessed monomial \(x^\beta\)x , if there exists \(f_k \in G\)f k G with \(\mathrm{LM}(f_k) \mid x^\beta\)LM(f k) x , the product \(\frac{x^\beta}{\mathrm{LM}(f_k)} f_k\)x LM}(f k)} f_k is included in \(H\)H, satisfying (ii).
Algorithm d4. Symbolic Preprocessing.
Input: a set of polynomials \(L\)L, a set of polynomials \(G\)G.
Output: a matrix \(M\)M.
- \(H \gets L\)H L, \(\mathit{done} \gets \mathrm{LM}(H)\)done LM(H).
- while \(\mathit{done} \neq M_\succeq(H)\)done M (H):
- \(x^\beta \gets\)x the largest monomial in \(M_\succeq(H) \setminus \mathit{done}\)M (H) done.
- \(\mathit{done} \gets \mathit{done} \cup \{x^\beta\}\)done done \cup \{x \beta\}.
- if there exists \(f_k \in G\)f k G such that
\(\mathrm{LM}(f_k) \mid x^\beta\)LM(f k) x :
- Select any one such \(f_k\)f k.
- \(H \gets H \cup \left\{\frac{x^\beta}{\mathrm{LM}(f_k)} f_k\right\}\)H H \\fracx \beta\mathrm{LM}(f k)} f_k\right\}.
- return matrix of coefficients of \(H\)H with respect to \(M_\succeq(H)\)M (H) with columns in decreasing order according to \(\succeq\).
For a better intuitive insight, consider the matrix \(M\)M which represents the set \(H\)H. Recall that \(L \subseteq H\)L H, so \(M\)M contains both halves of \(S(f_i, f_j)\)S(f i, f j) for any \(\{i, j\}\)\i, j\ currently in \(B’\)B’. The S-polynomial \(S(f_i, f_j)\)S(f i, f j) is the difference of the two halves, so it can be represented as a linear combination of the rows of \(M\)M. The rows of \(N\)N form a basis for the vector space spanned by the rows of \(M\)M. This means that \(S(f_i, f_j)\)S(f i, f j) can also be represented by a linear combination of the rows of \(N\)N. It can be shown that this gives \(S(f_i, f_j) \to_G 0\)S(f i, f j) _G 0 when we include the new polynomials from \(N\)N into \(G\)G.