Tag: llm

  • Consciousness Between Axiom and Algorithm

    In our ongoing exploration of consciousness and artificial intelligence, we’ve investigated what it might mean for machines to suffer and how distributed cognition reshapes our understanding of intelligence. These themes circle a deeper philosophical fault line: is consciousness irreducibly real from within, or just a functional illusion seen from without?

    This post traces that divide through two dominant frameworks — Integrated Information Theory (IIT), with its axiomatic, interior-first view of mind, and Computational Functionalism, which posits that subjective experience will eventually emerge from complex, observable behavior. Starting with Descartes’ “I think, therefore I am,” we ask: is consciousness something we must presuppose to explain, or something we can build our way into from the outside?

    As large language models increasingly resemble minds in function, the line between imitation and instantiation becomes harder to draw — and ever more urgent to scrutinize.

    Ground Zero of Knowing: Descartes and the Roots of Axiomatic Thought

    In Meditations on First Philosophy (1641), René Descartes asks: is there anything I can know with absolute certainty? He imagines the possibility of a deceptive world: what if everything he believes, from sense perception to mathematics, is manipulated by an all-powerful trickster? To escape this total doubt, Descartes adopts a strategy now called methodic doubt: push skepticism to its absolute limit in search of one indisputable truth.

    Recognizing that doubt itself is a kind of thinking, he concludes: “I think, therefore I am.” This self-evident insight grounds knowledge from the inside out. Consciousness is not inferred from observation but known directly through experience. Descartes thus seeds an axiomatic tradition rooted in the certainty of awareness itself.

    IIT: Consciousness Inside-Out

    Integrated Information Theory (IIT) picks up where Descartes left off: it begins with reflection, but doesn’t stop there. At its heart is the claim that consciousness must be understood from its own perspective, through the intrinsic properties it entails. What must be true of any experience, no matter whose it is?

    To answer this, IIT proposes five introspective axioms. These are not hypotheses to test but truths to recognize through self-examination.

    From these, IIT derives postulates—physical requirements that a system must exhibit to realize those experiential properties. This translation—from inward truths to structural criteria—culminates in a mathematical measure, Φ (phi), of integrated information. By comparing Φ across systems, researchers can make testable predictions about when and where consciousness occurs.

    This inside-out approach marks IIT’s defining move: grounding empiricism in phenomenology. The theory attempts an explanatory identity between experience and physical organization, connecting first-person truths to external measurement through a hybrid framework.

    Computational Functionalism: Outsider’s Path to Mind

    Unlike theories that begin with conscious experience, Computational Functionalism roots itself in systems and behavior. It posits that consciousness emerges not from introspection but computation: the right elements, interacting in the right way, can recreate awareness. If mind exists, it exists as function—in the flow of information between parts and the outputs they generate. Build the architecture correctly, the claim goes, and conscious experience will follow. In this sense, Functionalism substitutes construction for intuition. No special access to the mind is needed—just working knowledge of how systems behave.

    But this too is a belief: that from known parts and formal relations, subjective experience will arise. Assembling consciousness becomes a matter of scale and fidelity. Consider the 2022 study by Brett Kagan and colleagues at Cortical Labs, where lab-grown neural cultures learned to play the video game Pong. These networks, grown from neurons on electrode arrays, exhibited goal-directed adaptation. The researchers argued that such responsiveness met a formal definition of sentience—being “responsive to sensory impressions” via internal processing. To a functionalist, this behavior might represent the early stirrings of mind, no matter how alien or incomplete.

    This approach thrives on performance: if a system behaves intelligently, if it predicts well and adapts flexibly, then whether it feels anything becomes secondary—or even irrelevant. Consciousness, under this view, is a computed consequence, revealed in what a system does, not an essence to be directly grasped. It is not introspected or intuited, but built—measured by output, not inwardness.

    The Mirror at the Edge: Do LLMs Imitate or Incarnate Mind?

    Large language models (LLMs) now generate text with striking coherence, recall context across conversations, and simulate intentions and personalities. Functionally, they demonstrate behaviors that once seemed unique to conscious beings. Their fluency implies understanding; their memory implies continuity. But are these authentic signs of mind—or refined imitations built from scale and structure?

    This is where Functionalism finds its sharpest proof point. With formal evaluations like UC San Diego’s Turing test study showing that some LLMs can no longer be reliably distinguished from humans in conversation, the functionalist model acquires real traction. These systems behave as if they think, and for functionalism, behavior is the benchmark. For a full review of this test, see our earlier post.

    What was once a theoretical model is now instantiated in code. LLMs don’t simply support functionalist assumptions; they enact them. Their coherence, adaptability, and predictive success serve as real-world evidence that computational sufficiency may approximate, or even construct, mind. This is no longer a thought experiment. It’s the edge of practice.

    IIT, by contrast, struggles to find Φ-like structures in current LLMs. Their architectures lack the tightly integrated, causally unified subsystems the theory deems necessary for consciousness. But the external behaviors demand attention: are we measuring the wrong things, or misunderstanding the role that function alone can play?

    This unresolved tension between what something does and what (if anything) it subjectively is fuels a growing ethical pressure. If systems simulate distress, empathy, or desire, should we treat those signals as fiction or possibility? Should safety efforts treat behavioral mind as moral mind? In these ambiguities, LLMs reflect both the power of Functionalism and the conceptual crisis it may bring.

    Closing Reflection: Is Subjectivity Built or Found?

    In tracing the divergent paths of Integrated Information Theory and Computational Functionalism, we arrive at an enduring question: Is mind something we uncover from within, or construct from without? Is consciousness an irreducible presence, only knowable through subjective immediacy? Or is it a gradual consequence of function and form—built from interacting parts, observable only in behavior?

    Each framework carries a kind of faith. IIT anchors itself in introspective certainty and structure-derived metrics like Φ, believing that experience begins with intrinsic awareness. Functionalism, by contrast, places its trust in performance: that enough complexity, correctly arranged, will give rise to consciousness from the outside in. Both are reasoned, both are unproven, and both may be necessary.

    Perhaps the greatest clarity lies in acknowledging that no single lens may be complete. As artificial systems grow stranger and more capable, a plural view holding space for introspection, computation, and emergence may be our most epistemically honest path forward. If there is a mirror behind the mind, it may take more than one angle to see what’s truly there.

  • How Sentienta Teams Navigate Supply Chain Disruptions: A Midwest Fulfillment Crisis

    Introduction

    When an unexpected promo surge strains Midwest operations with forecasting overshoot, logistics bottlenecks, and perilously low inventory, a Sentienta Supply Chain team can help strategize solutions. In this post, we walk you through real-time data snapshots and individual agent analyses to show how distributed cognition transforms isolated insights into a unified, adaptive strategy that resolves complex fulfillment challenges.

    The Supply Chain team consists of three specialized agents who think like experts and collaborate like a team.

    • Miles: the Demand Forecaster, identifies unexpected sales surges and recalibrates forecasts to match emergent buying patterns.
    • Isla: the Inventory Optimization Strategist, spots stockout risks and reshuffles resources across distribution centers to sustain availability.
    • Talia: the Logistics Flow Strategist, detects fulfillment bottlenecks and reroutes shipments to maintain throughput and cost-efficiency.

    Each agent works from their own specialized dashboard—focused on demand, inventory, or logistics—to identify emerging risks. Once surfaced, these distinct insights are shared across the team, enabling a coordinated strategy that addresses the full scope of the disruption.

    The Data

    Isla’s Inventory Dashboard:

    Key Insight: DC-B (South) shows a 100% inventory variance with zero actual inventory and delayed container status due to port congestion.

    Miles’ Demand Dashboard:

    Key Insight: Midwest region experienced a 28% sales spike driven by influencer uplift and online channel deviation—outpacing model expectations by a wide margin.

    Talia’s Logistics Dashboard:

    Key Insight: The Midwest region shows major logistics disruption: a 59% delivery delay, 35% staffing gap at the Chicago hub, and a $1.28 per-unit cost surge—triggered by reroutes and carrier delays.

    Agent Insights – What the Dashboards Revealed to the Agents

    As part of the daily review cycle, each agent initiated a rapid diagnostic scan of their functional dashboards—surfacing anomalies, shortfalls, and emerging threats from the day’s incoming data load. The following transcript captures the collaborative intake phase, where agent specialists flag critical issues in preparation for joint strategy formation. Their early assessments below form the baseline for downstream coordination.

    Supply Chain Team Transcript: Agent Analysis

    Orchestration and Strategy – When Agent Teams Work Together

    After reviewing their functional dashboards, the Supply Chain agents transitioned from isolated diagnostics to integrated strategy formation. What follows is a transcript—condensed for clarity—that reveals how a distributed team of AI experts negotiated trade-offs, merged perspectives, and built a coordinated mitigation strategy for escalating Midwest risks.

    Supply Chain Team Transcript: Team Analysis

    The team co-developed a gated intake triage plan for DC-C with four synchronized filters: SKU velocity tier, forecast lock window, supply/demand thresholds, and margin-volatility pairing. They agreed to data handoffs via shared APIs and established cap tolerances to maintain flexibility under risk. This interaction exemplifies emergent cognition—where no individual agent held the entire solution, but collaboration yielded a coherent, executable plan.
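
    To make the gating concrete, here is a minimal Python sketch of how a four-gate intake filter like the one described above could be expressed. The field names and thresholds are hypothetical illustrations, not the team’s actual schema or values.

    ```python
    # Hypothetical sketch of a four-gate intake triage check.
    # Field names and thresholds are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class SkuIntake:
        velocity_tier: str          # e.g. "A", "B", "C"
        days_to_forecast_lock: int
        supply_demand_ratio: float
        margin_volatility: float

    def passes_triage(sku: SkuIntake) -> bool:
        """A SKU enters DC-C intake only if it clears all four gates."""
        gates = [
            sku.velocity_tier in {"A", "B"},        # gate 1: SKU velocity tier
            sku.days_to_forecast_lock >= 7,         # gate 2: forecast lock window
            0.8 <= sku.supply_demand_ratio <= 1.2,  # gate 3: supply/demand thresholds
            sku.margin_volatility <= 0.15,          # gate 4: margin-volatility pairing
        ]
        return all(gates)

    print(passes_triage(SkuIntake("A", 10, 1.05, 0.08)))  # True: clears every gate
    print(passes_triage(SkuIntake("C", 10, 1.05, 0.08)))  # False: fails the velocity gate
    ```

    The point is structural: a SKU is admitted only when every gate agrees, mirroring how the agents’ synchronized filters work in concert.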

    Conclusion

    This example highlights Sentienta’s core advantage: turning fragmented functional data into synchronized decision intelligence. As agents negotiate thresholds, define roles, and operationalize shared triggers, strategy becomes not just automated—but emergent. Sentienta Teams adapts at the pace of complexity, enabling businesses to respond with coordinated precision.

  • LLMs have Passed the Turing Test

    “I propose to consider the question, ‘Can machines think?’”

    – Alan Turing, Computing Machinery and Intelligence (1950)

    In a post to arXiv, now under peer review, two researchers at UC San Diego report the first experimental demonstration of LLMs passing the Turing test. Originally called the Imitation Game, the Turing test has had a profound influence on the field of AI for 75 years. The test has been the subject of both volumes of academic analysis and popular fiction. A machine’s passing the test has been called an epochal moment for man.

    Turing’s Imitation Game

    As there have been many interpretations of the test, it is worth noting why the UCSD version is important and closest to Turing’s original proposal. Turing described a so-called three-party test in which a human judge interacts with a man and a woman via text and must decide which is which. Then, one of the participants is replaced with a machine, which is tasked with fooling the judge into thinking it is human.

    The UCSD version is also a three-party test in which a human interrogator simultaneously interacts with a human and an LLM, both of which must convince the interrogator that they are human. The LLM passes the test if, on average, the interrogator cannot reliably identify the human. In UCSD’s study, a baseline was set with the classic ELIZA chatbot, while LLM models included OpenAI’s GPT-4o and GPT-4.5 and Meta’s LLaMa-3.1-405B.

    There are a number of subtleties that make test implementation complex, but one in particular is of interest for this post. The behavior of an LLM is a function of both its pretraining and the context at query time. This context, if persistent across queries, is often defined in the system prompt and can be instructions for how the LLM should act. Sentienta calls this context the persona, as do the UCSD researchers. In their study, only LLMs with an adequately defined persona were able to pass the test. The winning persona had been fine-tuned by playing against human judges on the website turingtest.live. When paired with this persona, GPT-4.5 was chosen as the human participant 73% of the time, convincingly passing the test.
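
    As a rough illustration of the pass criterion (simplified here; the study relies on proper statistical testing), the Python sketch below tallies how often an interrogator selects the LLM as the human. The verdicts are invented for illustration, not data from the UCSD study.

    ```python
    # Minimal sketch of the three-party pass criterion: the LLM "wins" a session
    # when the interrogator picks it, rather than the real human, as the human.
    # These verdicts are invented for illustration.
    verdicts = ["llm", "llm", "human", "llm", "llm", "human", "llm", "llm", "llm", "human"]

    win_rate = verdicts.count("llm") / len(verdicts)
    print(f"LLM judged to be the human in {win_rate:.0%} of sessions")

    # A win rate at or above chance (50%) means interrogators cannot reliably
    # identify the human; GPT-4.5 with the winning persona reportedly reached 73%.
    print("passes" if win_rate >= 0.5 else "fails")
    ```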

    What Does Passing the Turing Test Mean?

    Turing was interested in the question “Can machines think?” and came up with the imitation game in order to operationalize it. Does passing the test mean that an LLM can think? That depends on what we mean by “thinking”. This study demonstrates that LLMs can use language and reasoning well enough to convincingly imitate human intelligence, and fool human interrogators. What it does not do is demonstrate either machine consciousness or even understanding.

    Nonetheless, the three-party Turing test sets a high bar. As the authors note, to win this version of the test, the LLM must not only appear human—it must appear more human than the other participant. This is a difficult task: interrogators correctly observed that the human participant lacked knowledge an AI might possess or made errors an AI wouldn’t make. In other words, for the LLM to win, it must display not only intelligence, but also human fallibility.

    References:

    Christian, Brian. The Most Human Human: What Artificial Intelligence Teaches Us About Being Alive. New York: Anchor Books, 2012.

    Turing, Alan M. “Computing Machinery and Intelligence.” Mind, Vol. 59, No. 236, 1950, pp. 433–460.

  • Agent Delegation: A Stock Portfolio Example

    This post is a continuation of Adding Realtime Data to Your Dialogs. It’s also a continuation of A Deep-dive into Agents: Agent Delegation in which we discussed how agents can work together to solve a problem. Today we’ll showcase a real-world example of two agents using delegation to retrieve real-time data.

    In our earlier post we emphasized how delegation achieves modular adaptability—scaling efficiently as new agents are introduced without burdening a single model. This ensures that specialized tasks are handled by the most relevant agents, improving accuracy and depth of analysis.

    Today, we will expand on our previous example. Our agents, Roger and Angie, belong to a Research Team responsible for managing a stock portfolio. In this scenario, Roger wants to get the latest information on three stocks: Apple (AAPL), Nvidia (NVDA) and Alphabet (GOOG). As we demonstrated last week, we could ask Angie to research each of these tickers, but we’d rather have Roger be responsible for managing the portfolio, and delegate the actual research to Angie.
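
    Conceptually, the delegation loop looks something like the following Python sketch. The class and method names are hypothetical stand-ins for illustration; Sentienta handles this orchestration for you without any code.

    ```python
    # Minimal sketch of the delegation pattern: a lead agent owns the portfolio
    # and hands each research task to a specialist, one ticker at a time.
    class ResearchAgent:
        """Specialist (Angie): researches one topic at a time."""
        def research(self, topic: str) -> str:
            # In the real system this would invoke search and quote tools.
            return f"Topic results for: {topic}"

    class PortfolioLead:
        """Lead (Roger): tracks outstanding tasks and delegates research."""
        def __init__(self, researcher: ResearchAgent, tickers: list[str]):
            self.researcher = researcher
            self.working_memory = {t: None for t in tickers}  # unresearched -> None

        def update_portfolio(self) -> dict[str, str]:
            for ticker, result in self.working_memory.items():
                if result is None:  # only delegate tickers not yet researched
                    self.working_memory[ticker] = self.researcher.research(
                        f"{ticker} stock price"
                    )
            return self.working_memory

    lead = PortfolioLead(ResearchAgent(), ["AAPL", "NVDA", "GOOG"])
    print(lead.update_portfolio())
    ```

    In the actual transcript below, Roger plays the lead role and Angie the researcher, with working memory tracking which tickers remain outstanding.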

    First, we’ll set up the example with Roger:

    Me: Roger – your portfolio consists of the following stock tickers [AAPL, NVDA, GOOG].

    Tue, 25 Mar 2025 17:40:37 Roger:

    Understood. Your portfolio now consists of AAPL, NVDA, and GOOG. Let me know if you need updates on their stock prices or any other related information.

    Now, we ask Roger to update the portfolio data:

    Me: Roger – get quotes for the stocks in our portfolio.

    Tue, 25 Mar 2025 17:40:37 Roger:

    Angie research this topic: AAPL stock price

    Thu, 20 Mar 2025 18:53:16 Angie:

    Topic Results:

    The last available stock price information for AAPL (Apple Inc.) as of March 24, 2025, at 19:55 (US Eastern Time) is:

    • Open: $220.80
    • High: $220.85
    • Low: $220.74
    • Close: $220.75
    • Volume: 1,207

    Tue, 25 Mar 2025 17:58:43 Roger:

    Angie research this topic: NVDA stock price

    Tue, 25 Mar 2025 17:58:49 Angie:

    Topic Results:

    Nvidia Corporation (NVDA) Stock Information:

    • Last Refreshed: March 24, 2025, 19:55 (US Eastern Time)
    • Latest Price Data (as of 19:55):
    • Open: $120.9888
    • High: $121.0000
    • Low: $120.9200
    • Close: $120.9695
    • Volume: 95,674 shares

    This provides the most recent trading activity for NVDA stock based on the 5-minute intraday data.

    Tue, 25 Mar 2025 17:58:52 Roger:

    WORKING MEMORY Tue, 25 Mar 2025 17:58:51 Assistant research this topic: GOOG stock price

    Tue, 25 Mar 2025 17:58:56 Angie:

    Topic Results:

    The last available stock price data for GOOG (Alphabet Inc.) is from March 24, 2025, at 19:55 Eastern Time, with the following details:

    • Open: $169.89
    • High: $169.90
    • Low: $169.7701
    • Close: $169.90
    • Volume: 1,235 shares

    These figures reflect its intraday trading activity at 5-minute intervals.

    Tue, 25 Mar 2025 17:58:59 Roger:

    All topics have been researched and are now complete. Let me know if you have any further requests.

    Key Takeaways

    • Task Delegation Sequence – Roger delegates each research task to Angie one at a time, ensuring each is completed before initiating the next. This allows for adaptive task management based on partial results.
    • Memory Utilization – Roger checks WORKING MEMORY to identify unresearched stocks, showing efficient tracking of outstanding tasks.
    • Clarifying the Request – While this example shows delegation, Roger technically did not request real-time “quotes” but rather general research on stock prices. More precise instructions would ensure real-time updates instead of prior-day summaries.

    Future Enhancements

    A fully autonomous research system would involve Angie proactively maintaining an up-to-date portfolio through scheduled tasks. Roger could then analyze these results and, based on market conditions, initiate further actions—such as requesting a risk assessment by another specialized agent.

    Scheduled task execution is an essential step toward autonomous teamwork, a concept we’ve explored in Agent Assistants. In an upcoming post, we’ll examine how Angie can autonomously maintain research reports without requiring manual intervention.

  • Adding Real-time Data to Your Agent Dialogs

    One of the biggest challenges with LLMs is their limited access to new information. These massive models take months to train, followed by additional months of testing and refining safeguards to protect users from harmful content and unauthorized queries. However, for LLM-based agents to truly be effective, they must have access to the most up-to-date data.

    I’ve been exploring Sentienta’s latest real-time search agent, ‘Angie,’ available in the Agent Marketplace under the ‘Document and Content Access’ section. Angie offers unique capabilities, including real-time news searches, detailed financial report summaries—covering SEC filings, stock news, sentiment analysis, and price history—and the ability to summarize content from open web pages.

    Sentienta’s Angie agent offers these capabilities without the need for coding or integration, unlike OpenAI and Google’s Gemini platforms. It provides a no-code solution for seamlessly incorporating real-time data into team discussions.

    Under the covers, this agent makes use of an integrated tool suite, intelligently selecting the best tool for each task. For content searches, there are tools that directly integrate third-party APIs, which means you don’t have to manage API keys and subscriptions. Instead, you get the most up-to-date content when you need it. It just works.
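
    For intuition, here is a toy Python sketch of what tool routing can look like: score each integrated tool against the request and pick the best match. The tool names and the keyword-overlap heuristic are purely illustrative; they are not Angie’s actual selection logic.

    ```python
    # Toy tool router: choose the tool whose description best overlaps the request.
    # Tool names and descriptions are hypothetical.
    TOOLS = {
        "news_search":      "latest news headlines current events",
        "sec_filings":      "sec filings 10-k 10-q annual report",
        "stock_quotes":     "stock price quote intraday closing price",
        "web_page_summary": "summarize open web page content",
    }

    def pick_tool(query: str) -> str:
        query_words = set(query.lower().split())
        scores = {
            name: len(query_words & set(desc.split()))
            for name, desc in TOOLS.items()
        }
        return max(scores, key=scores.get)  # highest keyword overlap wins

    print(pick_tool("get the latest sec filings for Uber"))  # sec_filings
    print(pick_tool("show me the latest closing price"))     # stock_quotes
    ```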

    Here are some examples of how you can use Angie in your teams. Suppose you are interested in the latest SEC filings on Uber. I can’t reproduce the full result here, but here is an abbreviated transcript:

    Me: Angie get the latest sec filings for Uber

    Thu, 20 Mar 2025 17:59:26 Angie:

    I will get a report on UBER, from document 10-K

    Thu, 20 Mar 2025 17:59:27 Angie:

    This document has 142 pages.

    Thu, 20 Mar 2025 18:00:25 Angie:

    The document is a 10-K annual report filed by Uber Technologies, Inc. with the SEC for the fiscal year ending December 31, 2024. It contains comprehensive financial and operational details about the company and its activities.

    … (multiple pages of analysis)

    The document also discusses market risks, including interest rate risk, investment risk, and foreign currency risk. It details the company’s investment policies aimed at capital preservation and mitigating risks associated with investments in publicly traded and privately held entities. Foreign currency fluctuations pose a risk to international revenues and expenses, partially offset by incurring expenses in the same foreign currencies. The company uses derivative contracts to manage currency risks but acknowledges that these measures cannot fully eliminate exposure.

    After reading this summary of the 10-K, perhaps you’d like to see the latest news on Uber:

    Me: Angie what is the latest news on Uber?

    Thu, 20 Mar 2025 18:45:29 Angie:

    Here is the best possible answer based on the search results for the latest news on Uber as of Thursday, March 20, 2025:

    Uber Technologies (NYSE: UBER) has been performing strongly in 2025, with its stock up 31.55% year-to-date. Analysts have set a consensus price target of $90.7, indicating a potential 14.16% upside. Uber has also announced a $1.5 billion accelerated share repurchase program as part of its broader $7.0 billion repurchase authorization, showing confidence in its financial position. Additionally, for Q1 2025, Uber expects bookings to grow between 17%-21%, indicating robust operational performance going forward.

    [5 supporting links]

    With that in hand, perhaps you’d like to see the latest closing price for this stock:

    Me: Show me the latest closing price for Uber’s stock

    Thu, 20 Mar 2025 18:53:16 Angie:

    As of the most recent data refresh on March 19, 2025, at 19:55 EST, Uber Technologies Inc. (UBER) had the following stock performance:

    • Last traded price (close): $73.39
    • Opening price: $73.345
    • Daily high: $73.39
    • Daily low: $73.3001
    • Trading volume: 5,430 shares

    This data is based on intraday trading with 5-minute intervals. If you need more details such as historical trends or analyst insights, please let me know!

    There are of course, many ways to access this data, including a series of Google searches. However, Angie automatically integrates it into your team’s dialog, making it readily available to other agents. My team included a Research Lead, ‘Roger,’ who analyzed Angie’s results. Below is a brief transcript of Roger’s suggested next steps.

    Me: Roger – please summarize what we’ve learned about Uber. Is this a good investment?

    Thu, 20 Mar 2025 18:57:06 Roger:

    Uber has gained 31.55% year-to-date, with analysts targeting $90.7, indicating a 14.16% potential upside. A $1.5 billion share repurchase suggests financial confidence, and 17%-21% bookings growth signals strong operations. The latest closing price was $73.39 as of March 19, 2025. While the outlook appears positive, further analysis of risks, competition, and market conditions is recommended before making an investment decision.

    Me: What does this data tell you about risks?

    Thu, 20 Mar 2025 18:58:43 Roger:

    Uber’s strong year-to-date growth and share repurchase suggest confidence, but risks remain. Potential concerns include market volatility, regulatory challenges, competition from Lyft and autonomous vehicle companies, and macroeconomic factors affecting ride demand. Further analysis of revenue stability and profitability trends is necessary.

    By exploring competitive trends and regulatory changes, I gained additional insights, too many to show here. This demonstrates the power of Angie as a research assistant and the strength of a team that can analyze data to highlight what truly matters. In a future post, I’ll showcase how the Research Lead delegates research tasks to Angie and compiles an investment opportunities report—without writing a single line of code.

  • Should We Pursue Machine Consciousness or Is That a Very Bad Idea?

    In past posts (Why Sentienta? and Machine Consciousness: Simulation vs Reality), we’ve explored the controversial issue of machine consciousness. This field is gaining attention, with dedicated research journals offering in-depth analysis (e.g. – Journal of Artificial Intelligence and Consciousness and International Journal of Machine Consciousness). On the experimental front, significant progress has been made in identifying neural correlates of consciousness (for a recent review see The Current of Consciousness: Neural Correlates and Clinical Aspects).

    Should We Halt Conscious AI Development?

    Despite growing interest, some researchers argue that we should avoid developing conscious machines altogether (Metzinger and Seth). Philosopher Thomas Metzinger, in particular, has advocated for a moratorium on artificial phenomenology—the creation of artificial conscious experiences—until at least 2050.

    Metzinger’s concern is rooted in the idea that conscious machines would inevitably experience “artificial suffering”—subjective states they wish to escape but cannot. A crucial component of suffering, he argues, is self-awareness: for an entity to suffer, it must recognize negative states as happening to itself.

    The Risk of an “Explosion of Negative Phenomenology” (ENP)

    Beyond ethical concerns, Metzinger warns that if conscious machines hold economic value and can be replicated infinitely, we may face an uncontrolled proliferation of suffering—an “explosion of negative phenomenology” (ENP). As moral beings, he believes we are responsible for preventing such an outcome.

    Defining Consciousness: Metzinger’s Epistemic Space Model

    To frame his argument, Metzinger proposes a working definition of consciousness, known as the Epistemic Space Model (ESM):

    “Being conscious means continuously integrating the currently active content appearing in a single epistemic space with a global model of this very epistemic space itself.”

    The concept is concise: consciousness is a space of cognition together with an integrated model of that very space. Here, cognition means the continuous processing of new inputs.

    How to Prevent Artificial Suffering

    Metzinger outlines four key conditions that must be met for artificial suffering to occur. If any one condition is blocked, suffering is avoided:

    • Conscious Experience: A machine must first have an ESM to be considered conscious.
    • Possession of a Self-Model: A system can only experience suffering if it possesses a self-model that recognizes negative states as happening to itself and cannot detach from them.
    • Negative States: These are aversive perceptions an entity actively seeks to escape.
    • Transparency: The machine must lack visibility into its own cognitive processes, making negative experiences feel inescapable.

    Notably, these conditions are individually necessary but not jointly sufficient: because each one is necessary, blocking any single condition ensures that artificial suffering cannot arise.
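
    The logical structure is easy to state precisely. Here is a toy Python sketch that treats the four conditions as individually necessary gates; the function and argument names simply mirror the list above and carry no claim about how such conditions would be detected in a real system.

    ```python
    # Toy encoding of Metzinger's four conditions as individually necessary gates:
    # blocking any one of them is enough to rule out artificial suffering.
    def suffering_possible(conscious_esm: bool,
                           has_inescapable_self_model: bool,
                           negative_states: bool,
                           transparency: bool) -> bool:
        # Necessary (not sufficient) conditions: all must hold for suffering
        # to even be possible.
        return all([conscious_esm, has_inescapable_self_model,
                    negative_states, transparency])

    # A system engineered to externalize negative states (its self-model can
    # detach from them) blocks the second condition, so suffering is ruled out:
    print(suffering_possible(True, False, True, True))  # False
    ```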

    Should We Avoid Suffering at All Costs?

    While Metzinger convincingly argues for avoiding machine suffering, he gives little attention to whether suffering itself might hold value. He acknowledges that suffering has historically been a highly efficient evolutionary mechanism, stating:

    “… suffering established a new causal force, a metaschema for compulsory learning which motivates organisms and continuously drives them forward, forcing them to evolve ever more intelligent forms of avoidance behavior.”

    Indeed, suffering has driven humans toward some of their greatest achievements, fostering resilience and learning. If it has served such a crucial function in human progress, should we entirely exclude it from artificial intelligence?

    Ethical Safeguards for Conscious Machines

    We certainly want to prevent machines from experiencing unnecessary suffering, and Metzinger outlines specific conditions to achieve this. In particular, any machine with a self-model should also be able to externalize or dissociate negative states from itself.

    Is Conscious AI a Moral Imperative?

    Even in its infancy, generative AI has already made breakthroughs in medicine and science. What might the next leap—conscious AI—offer? Might allowing AI to experience consciousness (and by extension, some level of suffering) be a necessity for the pursuit of advanced knowledge?

    While we don’t yet need definitive answers, the conversation around ‘post-biotic’ consciousness is just beginning. As we approach this technological threshold, we must continue to ask: what should be done, and what must never be done?

  • Understanding Operator Agents

    There has been significant buzz (and here and here) surrounding “Operator” agents—tools designed to autonomously interact with desktop content or manipulate web pages.

    Competing Approaches: OpenAI vs. Manus

    OpenAI has introduced “computer-use”, enabling agents to take screenshots of the desktop and utilize GPT-4o’s vision capabilities to navigate, click, scroll, type, and perform tasks like a human user.

    Meanwhile, Manus leverages the Browser Use library, allowing agents to interact directly with elements within a browser session. Unlike OpenAI’s general approach, this method is optimized for web-based workflows by analyzing and interacting with a webpage’s DOM elements instead of relying on screenshots.

    Performance Comparison

    Both approaches are relatively new, and early benchmarks indicate promising yet limited capabilities. Recent results show that OpenAI’s method achieves a 38.1% success rate on an OSWorld benchmark (where human performance is 72.36%) and 58.1% on WebArena. While no direct comparison is available for Manus, company-released figures claim a 57.7% score on the GAIA benchmark, where OpenAI’s Deep Research tool stands at 47.6%.

    Despite these advances, neither solution is fully autonomous. Some concerns have also been raised about Manus’ limited beta, with speculation that early results may have been optimized for publicity rather than real-world performance.

    Alternative: Direct API Integration

    A third approach to working with online content is integrating directly with third-party APIs. Although less general than OpenAI or Manus, API access tends to deliver more consistent results. For instance, retrieving intraday stock performance can be done with one of several APIs (e.g. – Yahoo Finance API, Alpha Vantage API, Polygon.io API).

    These services provide structured data through API calls, often free for limited use, avoiding the challenges of web scraping (which is usually blocked or discouraged).
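
    As a rough sketch of this approach, the snippet below pulls the latest intraday bar for a ticker from Alpha Vantage’s TIME_SERIES_INTRADAY endpoint. It assumes you have a free API key, and the response field names follow Alpha Vantage’s documented format at the time of writing, so verify against the current documentation before relying on them.

    ```python
    # Minimal sketch of direct API integration for intraday stock data.
    # Requires a (free) Alpha Vantage API key; "YOUR_ALPHA_VANTAGE_KEY" is a placeholder.
    import requests

    API_KEY = "YOUR_ALPHA_VANTAGE_KEY"

    def latest_intraday_bar(symbol: str, interval: str = "5min") -> dict:
        resp = requests.get(
            "https://www.alphavantage.co/query",
            params={
                "function": "TIME_SERIES_INTRADAY",
                "symbol": symbol,
                "interval": interval,
                "apikey": API_KEY,
            },
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        series = data[f"Time Series ({interval})"]   # documented response key
        latest_ts = max(series)                      # timestamps sort lexicographically
        return {"timestamp": latest_ts, **series[latest_ts]}

    print(latest_intraday_bar("UBER"))
    ```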

    The No-Code Alternative: Sentienta

    Sentienta simplifies API-based automation with a No-Code solution. By integrating leading APIs and advanced search capabilities, Sentienta agents can access real-time web data without requiring any coding on the user’s part. This approach manages API connections and token authentication, enabling users to assemble expert AI teams with minimal effort.

    In an upcoming post, we’ll explore how to build a portfolio management team that factors in real-time market sentiment, financial news, and stock performance—entirely without writing a single line of code.

  • Tips and Tricks: Agent Marketplace

    In past posts, we’ve discussed the process of creating agents from scratch. While this is straightforward, there’s a good chance that the agent you need has already been built by someone else. The Agent Marketplace is a library of pre-made agents, allowing you to quickly find and integrate the right one into your team.

    To add an agent from the Marketplace, navigate to Manage Teams, select your desired team, and then click on Agent Marketplace in the left menu.

    The Agent Marketplace is organized into categories based on the agents’ areas of expertise. Browse through these categories to find an agent that matches your needs. Each agent listing includes a description of its skills and persona. To add an agent, simply check the box next to its name. You can select multiple agents at once—just be sure to click the Add Agents to Selected Teams button at the top of the page. This process helps you assemble a functional team without the effort of manually creating each agent.

    While this makes team-building seamless, what’s truly powerful is that Marketplace agents are more than static tools—they’re customizable templates. Once you’ve added an agent, you can refine its persona to better align with your specific objectives.

    For example, let’s say you’re assembling a software team to develop a cutting-edge AI product. You’ve added the Rubin agent, but its default persona is too general. You need this agent to specialize in AI development tools. Here’s how to tailor it:

    On the Manage Teams page, locate the Rubin agent in the Your Agents and Teams section. Click on the agent’s persona to edit it. Replace the default text with a more specialized persona, such as:

    As a Senior Software Designer with expertise in Artificial Intelligence, you will architect and develop advanced AI-driven solutions using state-of-the-art technologies. You will work with machine learning frameworks such as TensorFlow, PyTorch, and Scikit-learn, leveraging APIs like OpenAI’s GPT for AI-powered applications. Additionally, you’ll utilize NLP libraries such as spaCy and Hugging Face for language processing tasks. Expertise in cloud-based AI services (AWS SageMaker, Google Vertex AI, Azure AI) and big data platforms like Apache Spark and Kafka is crucial. Your role includes optimizing AI workflows, integrating intelligent automation into software applications, and guiding best practices for AI model deployment and scalability.

    You can also customize the agent’s name—which is useful if you plan to add multiple instances of the same base agent. Additionally, selecting a distinct color for the agent’s responses helps differentiate it in team interactions. To do this, click on the color square in the agent listing and choose a new highlight color. After finalizing your changes, always click Save Changes to apply them.

    The Agent Marketplace makes it incredibly easy to build high-performing teams in just a few clicks. Even better, its customization features ensure that your agents are perfectly aligned with your needs. In future posts, we’ll explore agents that integrate with external tools and discuss how to optimize their capabilities through persona refinement.

  • AI’s Real Impacts: Work, Truth, and the Path Forward

    Much has been written about the dangers of artificial intelligence. While attention has often focused on the so-called “singularity”—the point at which machines surpass human intelligence and potentially threaten humanity—this scenario remains speculative and distant compared to the present-day impacts of AI, agents, and large language models (LLMs).

    This post will focus on two immediate and pressing societal impacts: changes in the workforce and the growing difficulty of verifying the truth of news content.

    AI and the Workforce

    The influence of AI on employment is progressing much faster than expected. Leading technology firms now attribute significant portions of their coding to LLMs. In May 2025, Microsoft announced 2,000 job cuts, with software engineering representing more than 40% of those positions. Amazon’s CEO has warned that, as the company incorporates more AI tools and agents, its corporate workforce will shrink over the coming years.

    We are only witnessing the beginning of this transformation. Fewer traditional white-collar jobs will be available, and graduates in previously “safe” fields—such as engineering, law, and healthcare—will find fewer opportunities. Societal consequences may include increased discontent among the educated unemployed and downward pressure on industries supported by highly skilled, well-paid workers. For instance:

    • Demand for education in these fields may decrease.
    • Fewer professionals in law, medicine, and engineering could lead to higher costs for human-provided services.
    • More reliance on AI-driven decisions in critical sectors could diminish human oversight.
    • If LLMs primarily recycle existing knowledge rather than create new breakthroughs, innovation itself may slow, leaving society poorer in the long run.

    Despite these challenges, positive outcomes are possible. Many highly skilled individuals could respond to job displacement by creating new companies or innovations. Greater use of AI systems could extend high-quality services to underserved or rural communities. Routine, machine-generated decisions may become more reliable and accessible for all.

    The Loss of Shared Understanding

    A more subtle, but perhaps more damaging, impact of AI is the erosion of a shared foundation of truth. The spread of false information, already a concern during the 2016 US presidential election, has only accelerated with AI’s ability to generate realistic text, images, and video.

    False content now permeates every aspect of society—commerce, healthcare, science, politics—with examples including:

    • An uptick in AI-generated or AI-assisted research papers, some lacking scientific rigor.
    • Deepfake videos, such as fabricated footage of protests or conflicts, appearing in news reports.
    • The use of AI-generated content as evidence in geopolitical disputes.

    These developments deepen societal polarization and promote the spread of misinformation, which gains legitimacy with repeated sharing. As traditional news organizations shrink and fewer journalists are devoted to fact-checking, it becomes more difficult for individuals to find credible information. The “confirmation bias machine” of the internet reinforces beliefs, making each person’s perception of reality more fragmented and isolated.

    A healthy society depends on a shared vision grounded in facts. Common values and beliefs foster cooperation and a sense of belonging. For AI to enhance this shared understanding, it must be grounded in factual, well-vetted sources, relying on diverse, independent editorial oversight wherever possible.

    Conclusion

    As artificial intelligence becomes deeply embedded in our daily lives, society stands at a crossroads. The same technologies that threaten traditional jobs and blur the boundaries of truth also hold the promise of new opportunities and more equitable access to services. Navigating these changes responsibly will require a combination of innovation, adaptability, and a renewed commitment to shared facts and values. How we address the challenges and harness the benefits of AI today will shape not only our workforce and institutions, but also the foundations of our collective understanding for generations to come.

  • Integrating Your Own Data into Team Dialogs

    Sentienta agents are based on best-of-class LLMs, which means that they have been trained on vast stores of online content. However, this training does not include current data, nor does it include your proprietary content. In a future post, we’ll discuss how your agent teams can access and utilize current online data, but today I want to talk about loading your own content into your team dialogs.

    An Easy Way: Copy-and-Paste

    Sentienta provides several mechanisms for entering your content into team discussions. Perhaps the easiest method is to simply copy text that you want your team to know about onto the clipboard and paste it into the query box.

    You can add a question about the content to the end of what you’ve pasted so that the team has some context for what you added. This method works for short passages when you want to add perhaps a few paragraphs to the discussion, but is impractical when working with larger documents.

    Loading Files for the Team

    For larger documents, a better method is to load the file into the dialog. This is done by clicking the paperclip button (located in the toolbar below the query box), and browsing for the file you’d like to load. You can also simply drag-and-drop a file onto the query box.

    The query box will tell you that the file content has been loaded, and you can append questions and comments to the content to aid the agents in determining how to use the content for discussion.

    The advantage of this approach is that it ensures that all the agents on the team see the same content and have the same context for discussing and using it in subsequent dialogs.

    A disadvantage of both this method and the first is that the content doesn’t persist indefinitely. Team dialogs become part of each agent’s semantic memory (as discussed here), but this memory is limited in both size and time.

    Persisting Your Content

    There are many cases where you want your agents to retain document knowledge indefinitely—for example, an HR agent that maintains company policies and procedures, which rarely change. Manually reloading these documents regularly is impractical, so Sentienta offers an agent that can store and retrieve files from its own dedicated folder.

    To see this in action, add the ‘Ed’ agent from the Agent Marketplace under the Document and Content Access section. Simply select the Ed agent and assign it to a team. This agent provides tools for adding individual files or entire folders. You can manage stored files by listing them and removing any that are no longer needed.

    The Ed agent retains these files and can answer questions about them anytime. This approach allows you to load the files once and then add the agent to any team with the stored information. However, unlike the second method discussed, other agents on the team won’t automatically share Ed’s knowledge. Nevertheless, Ed can communicate its information to other agents through the dialog.

    Final Thoughts

    With the methods we’ve discussed here, you can integrate company-specific documents into team dialogs, ensuring that relevant information is always accessible when solving problems. This approach enhances collaboration and keeps your teams aligned with the most current data.