Code used in the paper “Ranks of matrices with few distinct entries”
The paper constructs $n$-letter words over $k$-letter alphabet for which the largest number of occurrences of any subsequences is $\approx k^{n/4k}$. The construction is based on $8$ sign vectors $$ \begin{align*} u^{(1)}&=(+,+,+,+,+,+,+,+),\\ u^{(2)}&=(-,-,-,+,-,+,-,-),\\ u^{(3)}&=(+,+,+,-,-,-,-,+),\\ u^{(4)}&=(-,+,-,-,+,+,+,-),\\ u^{(5)}&=(+,-,-,+,-,-,+,+),\\ u^{(6)}&=(-,+,+,+,+,-,-,-),\\ u^{(7)}&=(+,-,-,-,+,+,-,+),\\ u^{(8)}&=(-,-,+,-,-,-,+,-). \end{align*} $$ It is then asserted that these vectors satisfy the following eight properties.
- $u^{(i)},u^{(i+1)}$ agree on at most two coordinates.
- $u^{(i)},u^{(j)}$ agree on at most four coordinates if $u^{(i)}\neq u^{(j)}$.
- $u^{(i)},u^{(i+1)},u^{(i+2)}$ do not agree on any coordinate.
- $u^{(i)},u^{(i+1)},u^{(j)}$ agree on at most one coordinate if $u^{(j)}\notin \{u^{(i)},u^{(i+1)}\}$.
- $u^{(i)},u^{(j)},u^{(\ell)}$ agree on at most two coordinates if $u^{(i)}, u^{(j)}, u^{(\ell)}$ are distinct.
- $u^{(i)}_{\geq 7}\neq u^{(i+j)}_{\geq 7}$ for $j\in [3].$
- $u^{(i)}_{\geq 6}\neq u^{(i+j)}_{\geq 6}$ for $j\in [7]$.
- Vectors $u^{(i)}_{\geq 4},u^{(i+j)}_{\geq 4}$ agree on at most three coordinates for $j\in [7]$.
U = {{1, 1, 1, 1, 1, 1, 1, 1}, {0, 0, 0, 1, 0, 1, 0, 0}, {1, 1, 1, 0, 0, 0, 0, 1}, {0, 1, 0, 0, 1, 1, 1, 0}, {1, 0, 0, 1, 0, 0, 1, 1}, {0, 1, 1, 1, 1, 0, 0, 0}, {1, 0, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 0, 0, 0, 1, 0}}; Agree[L_List] := Table[CountDistinct[L[[All, i]]] == 1, {i, 1, Length[L[[1]]]}]; CountAgree[L_List] := Count[Agree[L], True]; (* Prop (a) *) Print[And @@ Table[CountAgree[U[[{i, Mod[i, 8] + 1}]]] <= 2, {i, 8}]]; (* Prop (b) *) Print[And @@ Flatten[Table[(i == j) || CountAgree[U[[{i, j}]]] <= 4, {i, 1, 8}, {j, 1, 8}]]]; (* Prop (c) *) Print[And @@ Table[CountAgree[U[[{i, Mod[i, 8] + 1, Mod[i + 1, 8] + 1}]]] == 0, {i, 1, 8}]]; (* Prop (d) *) Print[And @@ Flatten[Table[(i == j) || (j == Mod[i, 8] + 1) || CountAgree[U[[{i, Mod[i, 8] + 1, j}]]] <= 1, {i, 1, 8}, {j, 1, 8}]]]; (* Prop (e) *) Print[And @@ Flatten[Table[(i == j) || (i == l) || (j == l) || CountAgree[U[[{i, j, l}]]] <= 2, {i, 1, 8}, {j, 1, 8}, {l, 1, 8}]]]; (* Prop (f) *) Print[And @@ Flatten[Table[ U[[i, {7, 8}]] != U[[Mod[i + j - 1, 8] + 1, {7, 8}]], {i, 1, 8}, {j, 1, 3}]]]; (* Prop (g) *) Print[And @@ Flatten[Table[ U[[i, {6, 7, 8}]] != U[[Mod[i + j - 1, 8] + 1, {6, 7, 8}]], {i, 1, 8}, {j, 1, 7}]]]; (* Prop (h) *) Print[And @@ Flatten[Table[ CountAgree[U[[{i, Mod[i + j - 1, 8] + 1}, {4, 5, 6, 7, 8}]]] <= 3, {i, 1, 8}, {j, 1, 7}]]];As a bonus, here is a code that finds the most frequent subsequences in a given word (abbracadabra in this example):
(* Compute M(v,w) when the inputs are given as lists of symbols *) CountCopies[v_List, w_List] := CountCopies[v, w] = Which[ Length[v] == 0, 1, Length[w] == 0, 0, v[[1]] != w[[1]], CountCopies[v, Drop[w, 1]], True, CountCopies[v, Drop[w, 1]] + CountCopies[Drop[v, 1], Drop[w, 1]]]; (* Same but for strings *) CountCopies[v_String, w_String] := CountCopies[Characters[v], Characters[w]]; (* Report all subsequences of w, paired with their occurrence frequency, sorted *) MaxCopies[w_String] := Module[{L = Characters[w]}, Sort[Tally[(L[[#]] &) /@ Subsets[Range[Length[L]]]], (#1[[2]] < #2[[2]] &)]]; Print[MaxCopies["abbracadabra"][[{-1, -2, -3}]]]