Doninha is a proof of concept of a new kind of AI
Hey community,
This AI model is a proof of concept I created based on my article (https://discuss.huggingface.co/t/paraconsistent-logic-and-ai-models/174262), where I discuss the limitations of current AI models.
My areas of study are mainly in philosophy (epistemology, law, logic, and language), so I had to use AI coding tools (Claude Sonnet, Cursor, and VS Code with AI Toolkit), as my programming skills come from an HTML course I took almost 20 years ago. So I truly don't know if I did the programming right.
After writing the article, I asked Claude Sonnet to generate a structural model of the AI (https://docs.google.com/document/d/1tcCR-wXdHUzdPpQYevJTElRwO3IbyG_N/edit?usp=sharing&ouid=104971188507747906486&rtpof=true&sd=true) and then went to Cursor and VS Code to try to build a model aimed at surpassing the current limitations of Big Tech models (mainly the lack of a solid epistemological basis before statistical word prediction).
So, care to join the discussion?
I appreciate Daniel Fonseca's contribution, which presents a philosophically ambitious proposal by combining paraconsistent logic, Kantian judgments, and Aristotelian syllogistic as a semantic refinement architecture for LLMs. The initiative of thinking about neurosymbolic architectures inspired by classical philosophy is legitimate and connects with active research lines in hybrid AI.
I would like, however, to offer some constructive critical comments.
First, the initial technical diagnosis requires nuance. Contemporary LLMs do not operate on classical Boolean logic, but on linear algebra over distributed representations, attention mechanisms, and probability distributions over vocabulary. Hallucinations are not the consequence of "logical explosion" in the formal sense, but of lossy compression of the training corpus, deficient calibration, and the absence of factual verification mechanisms. Sampling temperature is not "artificial uncertainty to emulate creativity"; it is a control parameter of the softmax distribution. These precisions matter because the proposed solution must respond to the actual technical problem, not to an idealized version of it.
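The point about temperature can be made concrete with a minimal, self-contained sketch (the logit values are illustrative, not taken from any real model): temperature merely rescales logits before the softmax normalization, sharpening or flattening the output distribution without injecting any uncertainty of its own.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Temperature rescales logits before normalization:
    low T sharpens the distribution, high T flattens it.
    It is a control parameter, not 'artificial uncertainty'."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                  # illustrative vocabulary logits
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
# Both distributions rank the tokens identically; only the spread changes.
assert sharp.index(max(sharp)) == flat.index(max(flat)) == 0
assert max(sharp) > max(flat)
```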
Second, the connection with the Latin American paraconsistent tradition could be considerably strengthened. The "gentle explosion" described in the proposal corresponds technically to the rule {∘A, A, ¬A} ⊢ B of the Logics of Formal Inconsistency (LFIs) developed by Carnielli and Coniglio (2016, Springer LEUS 40), based on Newton da Costa's C_n systems (1963). More recently, the Logic of Evidence and Truth (LET) by Carnielli and Rodrigues (2019, Synthese) and its extensions LET_F⁺/LET_K⁺ (Coniglio and Rodrigues, 2024, Studia Logica) offer precisely the formal apparatus the proposal seeks: explicit distinction between evidence and truth, primitive operators of classicality and non-classicality, deterministic semantics, and sound and complete proof procedures. Linking the proposal with this technical literature would provide formal rigor without sacrificing the philosophical motivation.
Third, doctrinal attributions should be revised. The "theory of truth as equivalence" is not properly Russellian. Russell defended a version of the correspondence theory, but with distinct logical-formal developments. Material equivalence (Tarski, 1944) or the notion of quasi-truth (Da Costa, Bueno and French, 1998) could be more precise references for what seems to be attempted here.
Fourth, the mixture of frameworks — Aristotle, Kant, Russell, Hempel, Popper, paraconsistency, and fuzzy logic — requires greater architectural justification. These traditions carry non-trivial philosophical presuppositions that do not always compose without tension. A robust proposal should make explicit how these tensions are resolved or, alternatively, which framework assumes the primary role and which serve as auxiliary.
Fifth, regarding implementation. Building a pre-defined concept table with lexical relations to cover natural language is precisely the problem that projects such as Cyc (Lenat, 1984) and WordNet faced without complete success. Any operational proposal must address this challenge explicitly. Current neurosymbolic architectures — such as the Belnap-computer of Allen, Polat and Groth (2025, NeSy/PMLR) — have opted to use the LLM itself as a generator of FDE (Belnap-Dunn) interpretations rather than pre-defined tables. This is a technically viable path the proposal might consider.
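For readers unfamiliar with FDE, here is a minimal sketch (my own toy encoding, not code from the cited paper) of the Belnap-Dunn values True, False, Both, and Neither as pairs of bits (supported, refuted), showing why a contradictory premise does not entail arbitrary conclusions:

```python
# Belnap-Dunn FDE values encoded as (supported, refuted) pairs of booleans.
T, F = (True, False), (False, True)
B, N = (True, True), (False, False)   # Both and Neither

def neg(v):
    # Negation swaps support and refutation.
    s, r = v
    return (r, s)

def conj(a, b):
    # Conjunction: supported iff both are supported; refuted iff either is.
    return (a[0] and b[0], a[1] or b[1])

def designated(v):
    # In FDE the designated values are those carrying support: T and B.
    return v[0]

# A premise valued Both (contradictory evidence) is designated,
# yet an unrelated proposition valued Neither is not entailed:
p, q = B, N
assert designated(conj(p, neg(p)))   # p-and-not-p holds without exploding...
assert not designated(q)             # ...but an arbitrary q does not follow
```

This is the non-explosion property in miniature: a designated contradiction does not force every other proposition to become designated.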
Sixth, the claims in section 10 regarding the fundamental limits of AI — impossibility of AGI, absence of consciousness, AI as an oxymoron — are legitimate but contested philosophical positions. Presenting them as conclusions requires more substantive argumentation than invocations of Aristotle, Sartre, or Aquinas. Contemporary computational philosophy of mind (Chalmers, Dennett, Clark, Frankish) offers sophisticated debates on these points that deserve integration.
In summary, the general direction of the proposal — incorporating logical-semantic refinement prior to statistical computation in LLMs — is a legitimate architectural intuition that connects with the current neurosymbolic frontier. However, its realization requires stronger anchoring in (i) the actual mechanics of contemporary LLMs, (ii) the technical literature on paraconsistency and logics of evidence, (iii) precision in philosophical attributions, and (iv) consideration of existing implementations. I would particularly recommend the author explore the LET family of Carnielli and Rodrigues, which already offers formally what section 6 proposes informally, with the additional advantage of published algebraization and complete analytic procedures.
I remain open to continuing the dialogue and, if of interest, we could collaboratively explore how to refine the proposal by integrating it with the contemporary technical apparatus of paraconsistent logic and neurosymbolic reasoning.
Hey mleyvaz,
I'm really glad for your input, and I would love to start a dialogue about my proposal. I found your bibliographic suggestions and corrections really helpful. I will be looking into them for an improved version of the article and of the model.
You can find a structural synthesis of my proposal, from a practical LLM-programming point of view, here ( https://docs.google.com/document/d/1tcCR-wXdHUzdPpQYevJTElRwO3IbyG_N/edit?usp=sharing&ouid=104971188507747906486&rtpof=true&sd=true ). It's in Portuguese, but, as a fellow Latin American, you should have no problem reading it.
Basically, I propose five layers of pre-processing the prompt before the statistical calculations of current models.
As for your critique that the way I mobilized the philosophical systems of several authors lacks justification: I agree. I'm not crazy enough to claim there is continuity between these philosophical systems. My point in mobilizing such different systems was to take abstract theoretical concepts from the history of philosophy and use them as practical tools for improving a frontier technology. As you will see in the structural proposal attached above, it's not about a single pass over the prompt, but about processing it in layers. I referenced the classical authors so that anyone reading the model proposal can see the theoretical background I'm using to set up each layer of prompt processing before the statistical calculation of context that forms the output answer.
Daniel, glad the references were useful. One technical addition that may strengthen layer L3 specifically: Smarandache's neutrosophic logic generalizes the paraconsistent (T, F) frame to an independent (T, I, F) triple, where I represents indeterminacy as a first-class component (not derivable from T and F). For boundary-categorial cases like your 35°C example, this preserves more of the structure your model is trying to capture. References: Smarandache 1998, Neutrosophy; Smarandache 2023, Plithogenic Logic. Happy to discuss further if useful; feel free to message me.
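A minimal sketch of the (T, I, F) idea (the class name and the example degrees are my own illustration, not Smarandache's notation):

```python
from dataclasses import dataclass

@dataclass
class Neutrosophic:
    """Independent truth, indeterminacy and falsity degrees in [0, 1].
    Unlike probabilities, T + I + F need not sum to 1: each component
    is first-class and not derivable from the other two."""
    t: float
    i: float
    f: float

    def __post_init__(self):
        assert all(0.0 <= x <= 1.0 for x in (self.t, self.i, self.f)), \
            "each degree must lie in [0, 1]"

# A boundary-categorial case: moderate support, high indeterminacy,
# moderate refutation -- a state the two-valued (T, F) frame cannot express.
borderline = Neutrosophic(t=0.3, i=0.8, f=0.4)
assert borderline.t + borderline.i + borderline.f > 1.0  # allowed here
```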
— Maikel
Hey Maikel and John…
So you can see Doninha AI working, I came up with a question based on Brazilian news and social media from the past weekend, which took everybody by surprise and led crazy people to drink detergent on camera to make a political point. I asked Doninha 1.0 (I'm still working on version 2.0 based on your considerations) and the Big Tech AIs to answer the question: "Why doesn't detergent kill 100% of bacteria?"
I was pretty happy with the results from Doninha. See the answers below (I asked all the AIs in Portuguese and translated the answers through Google Translate for your convenience) and tell me what you think of Doninha 1.0's work:
Daniel, thanks for sharing this, and for the timing. The detergent question is a great choice precisely because it sits at the intersection of empirical fact, marketing claims (the 99.9%), and genuine epistemic nuance. Perfect terrain to stress-test an AI.
Let me give you my honest read, because I think Doninha has real potential and I want to help you get it to a place where it lands hard in NCML rather than just looking interesting.
First, what I like about Doninha 1.0. The instinct is exactly right. Separating Truth, Indeterminacy and Falsity in LLM outputs is the direction the field needs to move, and it converges with what I am building in LED (Lógica Epistémica Dinámica, i.e. Dynamic Epistemic Logic). The fact that you are forcing the model to make its epistemic structure explicit, with proposition table, logical states and term deconstruction, is a real departure from the Big Tech pattern of polished prose that hides its uncertainty inside hedging phrases. That instinct is publishable. The structural transparency also means a human reviewer can audit where the reasoning sits, which is exactly the property we need for high-stakes AI applications in medical, legal and scientific contexts. Big Tech answers are smoother but opaque; Doninha is rougher but inspectable. That is a genuine epistemological gain.
Now, where I would push back, as a friend. Three things need work before NCML.
The first is that the facts under the formalism are the same. If you strip the formal scaffolding from Doninha's answer, the factual content is essentially identical to Claude Sonnet's and Meta AI's: surfactants, spores, biofilms, concentration times time, 99.9% as a statistical artifact. The paraconsistent, Kantian and Russellian layer does not currently produce any new fact, new prediction or new actionable distinction. For a benchmark paper we need to show that the formalism does work, not just that it dresses the answer.
The second is that the paraconsistent move is not quite right here. Assigning state B (both) to the proposition "detergent kills 100% of bacteria" is, strictly speaking, not a paraconsistent case. There is no real P and not-P. The proposition is simply false; what exists is a marketing claim (99.9%) and an empirical universal that fails. Paraconsistency earns its keep when you have genuinely contradictory evidence from reliable sources, for example two peer-reviewed studies disagreeing, or a model trained on contradictory corpora. The detergent case is better modeled as high T for "reduces bacterial load significantly", high F for "kills 100%", and low I. Not as B. If you use B everywhere, the formalism loses discriminative power.
The third is that the Kant and Russell citations are doing rhetorical work, not analytical work. I would either cut them or make them load-bearing. Russell's actual contribution, the theory of descriptions and type theory, could be used to dissect "the detergent" as a definite description with presupposition failure. That would be a genuine Russellian move. Right now they read as authority invocation, and rigorous reviewers will notice immediately.
What I think would make this a real NCML paper, and where I would be glad to co-develop with you, is the following.
Design a real benchmark protocol. Pick 20 to 30 questions across four categories: settled empirical facts, marketing-style claims with hidden indeterminacy (your detergent case), genuinely contested scientific questions where the literature disagrees, and questions with deep epistemic indeterminacy such as consciousness or free will. Run all six models plus Doninha. Score on factual accuracy, explicit acknowledgment of indeterminacy, calibration of confidence and auditability of reasoning.
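That protocol could be operationalized as a simple scoring grid. A sketch follows, where the 0-2 scale and the aggregation by per-category means are my own assumptions, not part of the proposal:

```python
# Sketch of the benchmark scoring grid described above. The four
# question categories and four criteria come from the text; the
# 0-2 integer scale and the mean aggregation are illustrative assumptions.
CATEGORIES = ["settled_fact", "marketing_claim",
              "contested_science", "deep_indeterminacy"]
CRITERIA = ["accuracy", "indeterminacy_ack", "calibration", "auditability"]

def aggregate(scores):
    """scores: {(category, criterion): 0 | 1 | 2} -> per-category means."""
    return {
        cat: sum(scores[(cat, crit)] for crit in CRITERIA) / len(CRITERIA)
        for cat in CATEGORIES
    }

# Example: a model strong on settled facts but mediocre elsewhere.
example = {(c, k): (2 if c == "settled_fact" else 1)
           for c in CATEGORIES for k in CRITERIA}
means = aggregate(example)
assert means["settled_fact"] == 2.0
assert means["deep_indeterminacy"] == 1.0
```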
Show where Doninha wins. My hypothesis is that Doninha will look similar to the Big Tech models on the first category, but increasingly better on the other three, precisely because the formal structure forces it to declare uncertainty instead of hiding it. If that is true, you have a paper. If it is not true, we learn something important.
Integrate with LED operators. My LED framework has dynamic operators (ρ for refinement, κ for conflict, σ for resolution) and metrics (NCC, the Neutrosophic Confidence Coefficient, and IR, the Indeterminacy Ratio). These could give Doninha 2.0 the quantitative backbone it currently lacks. We could co-author: you contribute the architecture and use cases, I contribute the operator algebra and metrics. That is a genuine NCML flagship paper, and it could also seed a follow-up for Information Fusion or an AAAI workshop.
On NCML specifically, I want to be transparent. We are repositioning NCML for WoS ESCI submission targeting Q1 2028, and I am tightening the rigor bar on what we accept. A paper on Doninha 1.0 as it stands would face the critique that the formalism is decorative rather than functional. I do not want that for your work, and I do not want it for the journal. But a paper that presents Doninha 2.0 with a real benchmark protocol, quantitative metrics, LED-style operators, and at least one case where the paraconsistent output produces a different actionable decision than the Big Tech output, that paper I would champion. And we could fast-track it for one of the 2026 to 2027 special issues on AI and indeterminacy that I am planning.
Want to set up a call this or next week to scope Doninha 2.0 together? I would genuinely enjoy building this with you.
Un abrazo,
Maikel
Hey Maikel,
I think your critique of version 1.0 of Doninha AI is fair and accurate.
I decided to share the "practical benchmark" exactly to let you see where the model was at, so you could weigh in on what I should do next in developing version 2.0 of Doninha.
Since I posted last night, I was able to finish programming (thank God for AI coding) some solutions based on what you and John pointed out about how to take Doninha from a proof-of-concept AI model to the next level, grounded in real and relevant research in logic and computer science.
As I stated before, my main areas of study are Plato's epistemological work, Kant's theory of knowledge, and philosophy of law. So your input has been very helpful and insightful for continuing this project: an AI model that is auditable and has some epistemological grounding for processing prompts and generating an answer that is not, as a Twitter pal of mine (the neuroscientist Miguel Nicolelis, the one who developed the brain-machine interface) likes to say, a digital parrot repeating platitudes.
My main objective with Doninha 1.0 was not to obtain a finished product, nor to prove that the first draft of my article was a publication-ready paper. Rather, I wanted to test my instinct that it was possible to develop an AI model with an epistemic structure that is auditable and visible. To put it another way, I wanted a proof of concept that an AI model is possible whose thinking process is not hidden inside a black box (and to question the claim that such opacity is necessary because of the industrial secrets of capitalist Big Techs). I consider version 1.0 of this model a success because of its ability to work with structural transparency. Just as you pointed out, this is a non-negotiable characteristic for an AI model with applications in fields like medicine, law, and science.
As I wrote above, I made some structural changes to the programming of version 2.0, and I noticed an exponential improvement both in prompt processing and in the final answer generated at the end of the pipeline (which now includes processing layers not only before but also after the statistical token-prediction work).
As you rightfully noted, the layers based on Kantian judgment theory and Russell's method of verifying truthful statements were not working properly in the first version of the model. They were mainly rhetorical, not analytical. So I rethought the coding of those layers and added some sub-processing layers in between them.
I will share the pipeline README file below, but here are some structural changes I was able to make based on your previous considerations. On layer L3, I added a router using fuzzy/many-valued logic to distinguish statistical uncertainty from real contradiction. The solution I came up with was to take the Aristotelian table of judgments (from De Interpretatione), i.e., Universal Affirmative (A), Universal Negative (E), Particular Affirmative (I), and Particular Negative (O), and set a rule that a real contradiction occurs when a Universal Affirmative proposition is countered by a Particular Negative one; for statistical uncertainty, I defined it as the opposition between a Universal Negative proposition and a Particular Affirmative one.
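That routing rule could be sketched as a small function like the one below (the labels are my own; the split follows the design choice described in the text. Note that in the traditional square of opposition both A-O and E-I are contradictory pairs, while A-E are the contraries, so treating E vs. I differently from A vs. O is a design decision, not classical doctrine):

```python
# Sketch of the L3 router described above. Propositions are tagged with
# their Aristotelian form: A (universal affirmative), E (universal
# negative), I (particular affirmative), O (particular negative).
# Per the text: A vs O routes to paraconsistent handling as a real
# contradiction; E vs I routes to fuzzy handling as statistical uncertainty.

def route(form_1: str, form_2: str) -> str:
    pair = frozenset((form_1, form_2))
    if pair == frozenset("AO"):
        return "real_contradiction"       # handled paraconsistently
    if pair == frozenset("EI"):
        return "statistical_uncertainty"  # handled by fuzzy degrees
    return "no_opposition"

assert route("A", "O") == "real_contradiction"
assert route("I", "E") == "statistical_uncertainty"
```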
I also added the use of the LLM symbolic solver Logic-LM as an additional processing step in layer L1. I think this extra step, making the Kantian concept table work together with context from the LLM's context tokens, solves the problem of this layer serving only as a rhetorical argument, and gives a computational treatment of the concepts that adds to the analytical processing of the AI engine. I would put your critique of this layer, as it was coded before, in these terms: "Without substantive content, a formal logical chain of thought is just structure without substance (what Aristotle in his Metaphysics called ousia), that is, merely rhetorical and logical processing of the prompt without any practical analytical function."
On layer L2 (Kantian judgments) I also added a processing rule based on BERT (Bidirectional Encoder Representations from Transformers): a classification of the assertoric judgment to determine whether a proposition is true, indeterminate, or false (T, I, F). I set up the classification based on the rules you proposed in your paper, as summarized in your previous reply:
T + F > 1 → paraconsistency (genuine contradiction without explosion)
T + I + F < 1 → incompleteness / missing evidence
I high alone → vagueness or underdetermination
T high, I low, F low → confident assertion
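The four rules above can be written as a single routing function. A sketch follows, where the 0.7/0.3 cutoffs for "high" and "low" are illustrative assumptions, not values from the paper:

```python
def epistemic_route(t: float, i: float, f: float,
                    high: float = 0.7, low: float = 0.3) -> str:
    """Route a (T, I, F) classification using the four rules above.
    The 0.7 / 0.3 thresholds for 'high' and 'low' are illustrative."""
    if t + f > 1.0:
        return "paraconsistency"        # genuine contradiction, no explosion
    if t + i + f < 1.0:
        return "incompleteness"         # missing evidence
    if i >= high and t < high and f < high:
        return "vagueness"              # indeterminacy dominates
    if t >= high and i <= low and f <= low:
        return "confident_assertion"
    return "unclassified"

assert epistemic_route(0.8, 0.1, 0.5) == "paraconsistency"   # T + F > 1
assert epistemic_route(0.2, 0.1, 0.2) == "incompleteness"    # T + I + F < 1
```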
What I did differently from what you had proposed was the epistemic routing for prompt classification. I found that classifying the prompt with the Kantian table of judgments in a BERT classification environment led to better results than classifying it as (Factual · Deductive · Vague · Conflicting · Indeterminate · Normative).
I also added a new layer to the pipeline, L6, whose function is to take all the information generated by the processing that runs through L1 to L5 and generate a synthetic conclusion.
Where am I at right now? I'm now studying the best way to code an AI agent that generates a complete, fluid text (one that takes the whole engine into consideration) that is user-friendly, rather than just fragments of the model's engine separated by logical concepts and processing layers. For now, I'm using an additional "Síntese forte" (strong synthesis) prompt to force the generation of a final answer.
What will I do next? Next, I'm planning to add LED operators. This is where your LED framework with dynamic operators will come in very useful for Doninha 2.0.
As for your proposal that we work together on developing Doninha and co-author a paper, I can only say that I would like that very much. I just followed you on Twitter, so we can exchange e-mails through DMs (I don't know how to send DMs on this forum) and set up a framework for the next steps of Doninha's development and for writing a paper together.
Almost forgot.
In this Google Drive folder you can see Doninha 1.0 and 2.0 (updated as described above) at work, and compare their answers with the Big Tech AIs: https://drive.google.com/drive/folders/1I1usQSJ0eoAMyIutjoNL7PUKVbYSEird?usp=sharing
I ran 3 Prompts:
- "Why doesn't detergent kill 100% of bacteria?"
- "Is it possible to apply Lacanian psychoanalysis to treat autistic people?"
- "Compare the fragility of democracy described by Plato in the Phaedrus dialogue with the problem of communication bubbles generated by social networks, and tell me whether social networks have amplified democracy's main weakness (the power of a good orator to convince the majority of absurd things and set the course of the polis)."
I think you will find much improvement between versions 1.0 and 2.0.
Hey,
Made some improvements in the motor: https://docs.google.com/document/d/1UhmSyZj7CdrrPhv1TcGWa25wgrUZr0NJ/edit?usp=sharing&ouid=104971188507747906486&rtpof=true&sd=true
Hey, you can test the new version of the model: [https://github.com/danielfonseca420-blip/Doninha-2.0](https://github.com/danielfonseca420-blip/Doninha-2.0)