Who the asterisk protects
In some routine official gazette, in the header of a single-judge decision from the State Court of Accounts, you find this sentence:
INTERESTED PARTY: Mariana Esteves Carvalho Albuquerque. CPF no. ***.482.317-**.
The sentence is well composed. The full name, prepositions in place. The CPF chopped at both ends. Iperon, when granting the retirement, saw no reason to hide the name of the retiree; the Court of Accounts, when registering the act, saw no reason to change that choice; but both saw reason to hide two chunks of the CPF. The document publishes and conceals on the same line, with the serenity of well-trained civil service.
The scene repeats across hundreds of decisions. The Court of Accounts performs the summary review of retirement acts granted by the state pension institute and publishes the result in its own Official Gazette. Each decision carries the interested party's full name, position, posting, the articles of the Constitution and amendments on which the act is founded, and the CPF masked at both ends. Nobody read the whole page and asked: if the name is right here, what are the asterisks protecting?
The math nobody does
The Brazilian CPF has eleven digits. The first nine are, in principle, free; the last two are check digits, computed from the first nine by a predictable operation: modulo eleven, fixed in a Receita Federal regulation.[1] In other words: the last two add no information that isn't already contained in the first nine. They exist to detect typos, not to hide information.
When a CPF is masked in the form ***.XXX.XXX-**, five digits are hidden. The casual reader counts five asterisks and imagines five digits of uncertainty. Five decimal digits would mean a hundred thousand possibilities. A hundred thousand is a big number.
It's the wrong number.
The last two asterisks don't hide anything the others haven't already said. Given any nine-digit prefix, the two check digits are unique. That leaves the three asterisks at the start. Three decimal digits. A thousand possibilities.
To enumerate those thousand possibilities, all you need is a three-level for loop in any language with integer arithmetic. For each candidate triple, you compute the two check digits, complete the CPF, and you're done: one valid CPF per candidate, a thousand candidates in total. The operation fits in fifteen lines of Python. It runs in milliseconds.
The math is mathing. Five asterisks look like five digits. They are not.
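The enumeration described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names `check_digits` and `candidates` are invented here, the mask layout `***.XXX.XXX-**` is taken from the example act, and the check-digit rule is the one given in the footnote.

```python
def check_digits(prefix):
    """Return the two CPF check digits for a nine-digit prefix (list of ints)."""
    def one_digit(digits):
        # Weights run from len(digits) + 1 down to 2, as described in the footnote.
        weights = range(len(digits) + 1, 1, -1)
        remainder = sum(w * d for w, d in zip(weights, digits)) % 11
        return 0 if remainder < 2 else 11 - remainder

    first = one_digit(prefix)
    second = one_digit(prefix + [first])
    return first, second


def candidates(visible_middle):
    """Yield every formatted CPF compatible with the mask ***.XXX.XXX-**."""
    middle = [int(ch) for ch in visible_middle]
    if len(middle) != 6:
        raise ValueError("expected the six visible digits")
    # The three-level loop from the text: one pass per hidden leading digit.
    for a in range(10):
        for b in range(10):
            for c in range(10):
                prefix = [a, b, c] + middle
                d1, d2 = check_digits(prefix)
                s = "".join(str(d) for d in prefix)
                yield f"{s[:3]}.{s[3:6]}.{s[6:9]}-{d1}{d2}"


# The six visible digits of the fictional example in the act's header.
found = list(candidates("482317"))
print(len(found))  # 1000
```

The two trailing asterisks contribute nothing to the count: each of the thousand prefixes produces exactly one pair of check digits.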
The name is the front door
The previous exercise, generating a thousand candidates, is elegant and unnecessary. In almost every practical case, nobody needs to generate a thousand candidates, because the five asterisks live surrounded by information that already uniquely identifies the person.
Mariana Esteves Carvalho Albuquerque, whose name appears in the single-judge decision, is not just any Mariana. She is a retired state civil servant, with a defined position, a recorded posting, a numbered registration. The Transparency Portal publishes the full name, registration number, position, posting and salary of the entire payroll. The state's Electronic Official Gazette, searchable by full text across almost two decades of archive, carries the appointment ordinance, some promotion, some leave, the publication of the retirement act. Somewhere in those publications, over those twenty years, the CPF appeared in full. The LGPD became law in 2018; the rest of the servant's documentary history is older, and was indexed.
The question the asterisk pretends to dodge is a question the asterisk has no way of dodging: who is this person. The act has already answered. The chopped CPF is a redundant confirmation of an identification already performed by the document's own header.
When the Brazilian system of performative protection feels especially diligent, it also anonymizes the registration number. Something like ****-1234 appears next to the chopped CPF. The operation is mathematically worse than publishing either of the two in full. Two partially masked identifiers cross by intersection: the set of candidates compatible with ***.482.317-** intersected with the set compatible with ****-1234 collapses, in most cases, to a single person, even without the name. The handbook that hides two fingers of the CPF and two fingers of the registration number is giving more information, not less.
Ahem ahem, IPERON 🤧.
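The intersection argument above can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the payroll is random synthetic data, the digit strings are not check-digit valid, the eight-digit registration format is a guess at what `****-1234` masks, and the planted "target" record is invented.

```python
import random

random.seed(0)  # fixed seed so the synthetic population is reproducible

# Entirely synthetic payroll: (cpf_digits, registration) pairs, no real data.
population = [
    (f"{random.randrange(10**11):011d}", f"{random.randrange(10**8):08d}")
    for _ in range(50_000)
]

# One planted, deliberately invalid record that matches both masks.
target = ("00048231700", "00001234")
population.append(target)


def matches_cpf(cpf, middle="482317"):
    # Mask ***.XXX.XXX-**: only digits 4 to 9 are visible.
    return cpf[3:9] == middle


def matches_registration(reg, tail="1234"):
    # Mask ****-1234: only the last four digits are visible (assumed format).
    return reg[-4:] == tail


by_cpf = {p for p in population if matches_cpf(p[0])}
by_reg = {p for p in population if matches_registration(p[1])}
both = by_cpf & by_reg

print(len(by_cpf), len(by_reg), len(both))
print(target in both)  # True
```

Each mask alone leaves a candidate set; the intersection can only be smaller than either. On a population of this size, a random record has roughly a one-in-ten-billion chance of matching both masks, so the intersection is expected to contain the planted record and nothing else.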
It wasn't always this way. Sometime between 2018 and 2022, everyone in the Brazilian public service became convinced, by a combination of stray handbooks and fear of the legal office, that the chopped CPF was the formal mark of LGPD compliance. The chop was applied without touching the rest. The name stayed in full because removing the name really would contradict the purpose of the act. The CPF was the offering laid on the altar.
flowchart LR
A["Act in the Court Gazette<br/>full name<br/>partial CPF"] --> B["Transparency Portal"]
A --> C["Searchable Official Gazette"]
B --> D["registration, position, posting"]
C --> E["older publications<br/>(full CPF)"]
D --> F["unique identification"]
E --> F
Robson and Dona Maria
Robson is twenty-seven, an IT technician at a gas station on the BR-364 highway, and knows enough Python to solve small problems. He maintains the card terminals, configures the convenience store's Wi-Fi, updates the pump's system. He reads the act because his brother-in-law has just retired and he's curious. The asterisks don't stop him because he doesn't even need to decipher them: he pastes the name into Google, finds the servant on the Transparency Portal, confirms it on the approved-candidates page of some old civil service exam, and in ten minutes he has the full picture. He used no tool that isn't free. He downloaded nothing. He ran no script. He just read, and the Brazilian system of official publications allows reading.
Dona Maria lives next to a civil servant who retired for permanent disability last year but still plays pickup soccer on Sundays. She's a widow, has read newspapers her whole life, and she's suspicious. She looks up her neighbor's name in the Official Gazette, finds the single-judge decision, reads disability retirement, and sees the CPF chopped at the ends. She has no technical training. She doesn't know about the Transparency Portal. The asterisks paralyze her, not because they are insurmountable, but because they signal legal ritual and Dona Maria has understood, correctly, that she wasn't invited to the ritual. She closes the browser. The social oversight she could have exercised, one of those small civic vigilances that sustain control over administrative acts, did not happen.
The spine question of the whole post fits in one sentence: which of the two does the anonymization work against?
Against Dona Maria. For Robson, an asterisk is challenge accepted.
The hacker from Araraquara
For the case in which Robson can't close it through web triangulation (stubborn homonymy, a servant with a clean digital presence, a target whose CPF was never published anywhere), there's no need to invoke a new category. It's the same Robson, with more tenacity and more free time. We can call him the hacker from Araraquara, in honor of the character from Brazilian political folklore who was moved to open prison last week. The only difference from Robson is that this one downloaded, from some torrent, the 2021 Serasa dump: two hundred and twenty million CPFs with full name, date of birth, address and mother's name, indexed in some SQLite file on an external drive. In any hard case, he resolves it in fifteen seconds.
The technical ceiling of the non-state, non-Big-Tech Brazilian adversary has a name, a criminal record and an ankle monitor, and is, materially, the Robson from the previous paragraph with more stubbornness. The handbook's barrier never even reached Robson's level.
flowchart LR
M["Dona Maria"] -. "stopped by asterisks" .-> X["✗"]
R["Robson"] -->|"10 minutes"| ID["unique<br/>identification"]
R -. "+ stubbornness<br/>+ Serasa dump" .-> H["hacker from<br/>Araraquara"]
H -->|"15 seconds"| ID
The PET bottle on top of the meter
Before going further, a concession is owed to the seriousness of ritual in general. Brazilians love mandinga, simpatia, the gesture incorporated into practice, and it isn't always silliness. Joseph Henrich, in The Secret of Our Success (Princeton, 2015), spends an entire book showing that apparently arbitrary cultural practices (food taboos, manioc-processing techniques, divination to choose where to hunt) frequently encode adaptive information accumulated across generations of selection, even when the practitioner can't articulate why. The ritual is memory inscribed in repetition, and to respect it is to respect that memory.
Until the early 2000s, in almost every Brazilian residential neighborhood, there was a gray box on the front wall of the house, the padrão de energia, with the utility's electricity meter inside, usually locked. On top of that box, it was common to see a two-liter PET soda bottle full of tap water, lying down or standing up. The popular theory was that the water did something to the meter: held it back, slowed the dial, confused it, nobody quite knew. It didn't. Water has no opinion about the meter. But the bill did come out cheaper. The bottle worked through another path: seeing it every day on the way out of the house reminded the family to turn off the living-room light, close the laundry tap, unplug the iron. The ritual was false in physics and true in psychology. It worked by mistake, but it worked, and it worked without an audience, because it was the family reminding itself.
The next category came with a technical name: security theater, coined by the cryptographer Bruce Schneier in the early 2000s to describe public protection rituals whose real function is just to display that a protection is being executed. The shoe inspection at airports is the canonical example. It doesn't stop a terrorist, but it has an audience: the passenger sees the protection being performed, the auditor records it, the press reports it. Ritual faces inward; security theater faces outward.
The asterisk in the Official Gazette is all three at once. It is ritual: an entire sector adopted it through belief incorporated into practice. It is a PET bottle: a piece of technical folklore that misjudged the physics of the CPF. And it is security theater: it was imposed for a generic auditor, whether the legal office, internal control, or the citizen who counts asterisks. It fails as ritual because it has no Henrich-style ballast: it accumulated zero generations, was adopted by bureaucratic imitation in four years, with no adaptive information encoded. It fails as theater because the audience has already learned to count asterisks and knows a thousand candidates remain. And it fails as a PET bottle because it lacks even the reminder side effect: whoever produces it is thinking about formal compliance, whoever reads it thinks ah, anonymization, and moves on to the full name right next to it.
The other 843 Franklin Silveira Baldos and I publicly thank you for hiding the 7, the 6 and the 4 of my CPF right after stating each one of our full names.
The Brazilian ritual normally pays the price of technical uselessness with the profit of psychological effect, or at least with the performative profit before an audience.
This one neither pays nor profits.
It isn't security theater. It's theater of security theater.
And here's the economic reason for the irritation: even if the asterisk were ritual in Henrich's sense, or theater in Schneier's sense, it would still have to pay for what it costs on the other side of the scale: the friction it adds to transparency. Each asterisk raises the cost of verification for the citizen, the journalist, the researcher, social oversight. That cost isn't zero; it's the price charged in the name of a protection benefit that, as we've seen, doesn't exist. Ritual without adaptive ballast, theater without a convinced audience, and in exchange verification gets more expensive for those who should be able to verify. There is no benefit that compensates. The added friction to transparency is unjustified, not in the legal sense but in the arithmetic sense: nothing on the positive side of the ledger covers what was spent.
The self-contradicting handbook
The production of the handbook has its own sociology, and the first absurdity is that there is no single handbook: there are hundreds. No unified technical guidance came out of the National Data Protection Authority. No general normative instruction came out of the federal government. No directive that the whole public sector could follow came out of any central body. Instead, in every autarchy, every court, every state secretariat, every professional council, every public university, a data-governance committee of its own was formed, with people from legal, from the chief of staff's office, from IT and from communications. Each of these committees meets. Each produces, in some quarter, a document titled, with discreet local variation, Best Practices for Anonymization of Personal Data in Administrative Acts. It's between four and twelve pages long, it bears the body's coat of arms, some grounding in the LGPD, and a final section with masking examples. The invariably recommended example is ***.XXX.XXX-**. The handbook is approved by ordinance. The ordinance is published in the Official Gazette. In that same Official Gazette, three pages later, someone's retirement act appears with the full name and the chopped CPF.
Hundreds of independent committees, in parallel, over years, worked to arrive at the same wrong answer.
The kind of institutional productivity only Brazil can pull off.
A small flash of credentials, low risk: my master's thesis was on administrative transparency. It's not a noble title; at most, it authorizes a technically qualified irritation with the normative PET bottle.
There's a detail that makes the picture less elegant and more human. Whoever is reading this is, with decent probability, the very person who wrote one of these handbooks: someone from legal, from the chief of staff's office, from IT, from the governance committee. I won't slander the room. I know several of these people. They take the job seriously, they read the LGPD end to end, read the ANPD's opinions, made time in the day for the mandatory course, and designed the masking on the most defensible reading of what looked like a legal requirement. The point here isn't good faith. It's the target.
The LGPD was written against Cambridge Analytica, against the data brokers, against the firm that crosses twenty sources and predicts voter behavior. It was not written against Dona Maria who opens the Official Gazette to check on the retired neighbor. Dona Maria's reading is fofoca (neighborhood gossip), and gossip, in public office, has a function. It is the local, free, decentralized form of social oversight: the citizen looking, commenting, asking the councilman, writing to the paper. Henrich, already cited, would describe this as a cultural ritual with ballast: neighborhood gossip has accumulated generations of people checking on each other's lives, and what selection left behind is one of the cheap infrastructures of Brazilian accountability. When the handbook aims at Dona Maria (without realizing it's aiming, because the adversary appears abstract in the text of the law), it doesn't err only in picking a harmless target. It errs because it eliminates, unintentionally, one of the few cultural practices that actually sustained the very oversight the handbook claims to practice. It was written to protect the citizen. It took from the citizen the tool the citizen used for oversight.
To measure the depth of the reflex, I asked a commercial language model for editorial feedback on this essay. The poor thing, trained on terabytes of Brazilian public text post-2018, recommended, with the best intentions, that I anonymize my own name in the 843 joke, because citing a real name next to a partially masked CPF could, according to it, expose the specific person. That specific person was me: signed author of the post, with my name in the canonical, in twitter:creator and in the browser URL. The handbook has even contaminated the synthetic reader, to the point that the ritual now tries to protect the victim from the explicit source of the information. It left Porto Velho, crossed the Pacific, was trained on some server in California, and came back intact in the form of well-meaning editorial advice. The ritual found a way to propagate itself even without committees.
What the LGPD actually says
The LGPD defined anonymization in art. 5, item XI, with words that don't admit the Brazilian use of the term:
Anonymization: the use of reasonable and available technical means at the time of processing, by which a datum loses the possibility of association, directly or indirectly, with an individual.
A thousand candidates crossed with full name, position, posting and two decades of indexed Official Gazette do not constitute a datum that has lost the possibility of association. Robson is not an unreasonable technical means. He's a gas-station tech with Python. The legal definition of anonymization is generously broad, and even so the Brazilian practice doesn't fit inside it.
The verb in the definition is specific: loses the possibility of association. Doesn't make it harder. Doesn't make it more expensive. Doesn't discourage the curious. Loses. The LGPD adopted a binary definition: either the datum was in fact disconnected from the subject, or it wasn't. There is no intermediate regime, there is no half-anonymization. Tricks that make reidentification trivial for any Robson don't meet the legal hypothesis: they don't even try. From the privacy side, then, the chop has nothing to stand on.
That leaves examining it from the opposite side: transparency. The LGPD provides, in art. 23, a specific hypothesis for the processing of personal data by the public power, articulated with the Access to Information Law, whose art. 8 defines the catalog of active transparency: salaries, personnel acts, contracts. The Constitution, in art. 37, caput, makes publicity a guiding principle of public administration. The Supreme Federal Court, in ARE 652.777 of 2015, decided that the nominal disclosure of civil servants' salaries is a legitimate consequence of that principle. The legal system, in other words, has already made its choice in favor of transparency for civil-servant administrative acts, and the chop of the CPF operates below that choice, raising the cost of verification for those who should be able to verify. It doesn't anonymize because it can't. It gets in the way because the full name right next to it summons a verification that the chop makes harder for no reason. It does the worst of both worlds, and does it firmly.
The missing mens legis
The LGPD was not conceived in Brazil. It is, to a large extent, the Brazilian cousin of the European General Data Protection Regulation (the GDPR, written in 2016 and in force since 2018). The GDPR did not come from a legislative vacuum: it came, in considerable part, from the political response to the growing perception, throughout the 2010s, that some companies were concentrating a disproportionate informational power. The Cambridge Analytica scandal, in 2018, gave name and face to that perception: Facebook revealed it had exposed the data of eighty-seven million users to a political consulting firm that used them for electoral microtargeting, in an episode that ran through the Brexit campaign and the 2016 American election. The GDPR's legislative work was already under way before the scandal; Cambridge Analytica gave the popular name to what was being regulated. The LGPD, two years later, reflected the same motivation.
What happened on the way from the law to the handbook is a form of transference. The companies that originated the concern keep operating essentially as they operated. Systemic leaks cross the Brazilian landscape without provoking a proportional institutional response. Serasa leaked some two hundred and twenty million CPFs in 2021. INSS records have appeared on forums for years. The telemarketer who calls during our lunch break knows the exact value of our last bill, and we've given up asking how he knows. The LGPD exists while all of this happens. But the part of the LGPD that actually bites (that generates committees, handbooks, training sessions, internal disciplinary actions, removal of useful information from public databases) is the part that squeezes the least dangerous agent in the system: the front-desk servant, the academic researcher, the local journalist, the citizen overseer.
It isn't necessary to attribute systemic bad faith to anyone for this to happen, and I don't. Someone said it once (Hanlon, of the razor that bears his name): one should not attribute to malice what stupidity adequately explains. Wise principle. Thinking about it, however: maybe bad faith from the legal office that writes the handbook knowing it has a login to the database and doesn't need the chop to identify anyone. Maybe bad faith from the governance committee, which justifies its own existence by producing, quarterly, the same PDF with an updated coat of arms. Maybe bad faith from the data protection authority, which has not issued a general normative instruction of application because the ambiguity preserves its own discretion. Maybe bad faith from the average administrator, who prefers the displayable formal protection to the invisible substantive one because it's the displayable that counts at audit. Maybe bad faith from the compliance industry, which makes a living training committees to write handbooks. Maybe bad faith from the incentive system, which punishes under-masking and never over-masking. Maybe bad faith from Big Tech and the data brokers, who watch in silence as the LGPD's ruler tightens around the front-desk servant and passes far from them. The list goes on. Hanlon's razor, applied patiently, often reveals a series of small malices working in sync, and the aggregate result is indistinguishable from systemic bad faith. The ritual survives on its own, sustained, we now see, by exactly those small malices coordinated without meetings. The handbook is displayable. Internal segregation of duties isn't. The asterisk is the visible mark of compliance, and that's why it multiplied.
flowchart TD
Q["Whom the partial<br/>asterisk doesn't stop"]
P["Whom the partial<br/>asterisk stops"]
Q --> BT["Big Tech / data brokers"]
Q --> H["hacker from Araraquara"]
Q --> R["Robson"]
P --> DM["Dona Maria"]
The honest alternative
The honest technical path for civil-servant administrative acts is simple and old. Either you publish by name what the Constitution wants public (name, position, posting, legal grounds, value of the benefits) and accept that oversight is, in part, popular; or you actually protect what needs to be protected (health, dependents, banking data, home address) through segregation of duties, access logs by registration number, periodic auditing of internal queries and mechanisms that detect patterns of inappropriate curiosity in database access. The two operations are compatible: the first is publicity, the second is protection. The asterisk in the Official Gazette is neither. It is a third thing, which looks like the second while undoing the first: a door with a lock that opens for Robson and bolts shut against Dona Maria.
The asterisk in the Official Gazette doesn't hide a person. It hides who is allowed to look at her. Robson is looking.
Further reading
- Law no. 13.709/2018 (LGPD), art. 5, XI: the legal definition of anonymization that Brazilian practice fails to meet.
- Latanya Sweeney, k-Anonymity: A Model for Protecting Privacy (2002): the canonical paper, with the finding that three demographic attributes (ZIP code, gender and date of birth) uniquely identify roughly 87% of American citizens.
- Arvind Narayanan and Vitaly Shmatikov, Robust De-anonymization of Large Sparse Datasets (2008): the Netflix Prize, empirical proof that "anonymized" datasets frequently are not.
- Paul Ohm, Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization (UCLA Law Review, 2010): the American legal essay against the illusion of perfect anonymization.
- Bruce Schneier, Beyond Fear (2003): the book in which the expression security theater first appears, and the systematization of what is real vs. performative protection.
- Joseph Henrich, The Secret of Our Success (Princeton, 2015): on why cultural rituals deserve respect: they often encode adaptive information accumulated by selection, even when the practitioner doesn't know why. The asterisk is the counterexample: ritual without ballast, adopted in four years by bureaucratic imitation.
- STF, ARE 652.777/SP (2015): the nominal disclosure of civil servants' salaries as a consequence of the constitutional principle of publicity.
- Law no. 12.527/2011 (LAI), art. 8: active transparency as a duty of the State, taking priority over the privacy of the public agent in the exercise of office.
- Wikipedia entry on Walter Delgatti Neto: the hacker from Araraquara as documentary character: the average Brazilian technical ceiling has a name, an address, a criminal record and an ankle monitor.
- Jorge Luis Borges, Funes el memorioso: on what happens when the database doesn't forget.
Footnotes
1. The reader who clicked this footnote is probably also the reader who would write the fifteen lines of Python. The CPF's two check digits are defined as follows: given the nine-digit prefix $d_1 \dots d_9$, you compute the weighted sum $s_1 = 10 d_1 + 9 d_2 + 8 d_3 + \dots + 2 d_9$, take the remainder $r_1 = s_1 \bmod 11$, and the tenth digit $D_1$ is $11 - r_1$, with the convention that it becomes $0$ when $r_1$ is less than 2. The eleventh, $D_2$, is defined analogously, with weights from 11 down to 2 applied to $d_1 \dots d_9$ and the freshly computed $D_1$. The operation is deterministic and cheap. It runs silently inside any system that validates a CPF (banks, tax returns, forms) and has done so for decades. Hiding the last two digits is like hiding the result of a sum whose every term is in plain sight.
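As a worked instance of the rule in this footnote, take the illustrative prefix 111.444.777 (chosen here only because its digits are easy to follow; it does not come from any act):

```latex
\begin{aligned}
s_1 &= 10\cdot1 + 9\cdot1 + 8\cdot1 + 7\cdot4 + 6\cdot4 + 5\cdot4 + 4\cdot7 + 3\cdot7 + 2\cdot7 = 162,\\
r_1 &= 162 \bmod 11 = 8, \qquad D_1 = 11 - 8 = 3,\\
s_2 &= 11\cdot1 + 10\cdot1 + 9\cdot1 + 8\cdot4 + 7\cdot4 + 6\cdot4 + 5\cdot7 + 4\cdot7 + 3\cdot7 + 2\cdot3 = 204,\\
r_2 &= 204 \bmod 11 = 6, \qquad D_2 = 11 - 6 = 5.
\end{aligned}
```

The completed number is 111.444.777-35: both trailing digits follow from the nine that precede them, which is all the masking of the last two positions ever had to hide.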