The get_simplified_html method retrieves a simplified version of the current page’s HTML.
Unlike get_full_html, it returns HTML that Narada has already processed, removing unnecessary elements and streamlining the structure for content extraction and analysis.

Method Signature

async def get_simplified_html(
    self,
    *,
    timeout: int | None = None,
) -> GetSimplifiedHtmlResponse

Parameters

timeout (int | None, default: None)
Maximum time in seconds to wait for the simplified HTML retrieval to complete. If None, uses the default system timeout.
timeout=30     # Wait up to 30 seconds
timeout=None   # Use default timeout
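As a rough illustration of how a caller might bound the wait on its own side, the sketch below wraps the call in asyncio.wait_for. StubAgent, StubResponse, and fetch_html_or_none are hypothetical stand-ins written for this example so that it runs without Narada installed; they are not part of the SDK. How Narada itself reports an expired timeout is not described on this page, so the sketch only handles the client-side guard.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class StubResponse:
    # Hypothetical stand-in for GetSimplifiedHtmlResponse.
    html: str


class StubAgent:
    """Hypothetical stand-in for a Narada agent so the sketch runs offline."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    async def get_simplified_html(self, *, timeout: int | None = None) -> StubResponse:
        await asyncio.sleep(self.delay)  # simulate retrieval latency
        return StubResponse(html="<html><body><p>ok</p></body></html>")


async def fetch_html_or_none(agent, *, timeout: int, guard: float) -> str | None:
    """Return the simplified HTML, or None if the client-side guard expires."""
    try:
        response = await asyncio.wait_for(
            agent.get_simplified_html(timeout=timeout), timeout=guard
        )
    except asyncio.TimeoutError:
        return None
    return response.html


# A fast call completes; a slow call is cut off by the guard.
fast = asyncio.run(fetch_html_or_none(StubAgent(0.0), timeout=30, guard=1.0))
slow = asyncio.run(fetch_html_or_none(StubAgent(0.5), timeout=30, guard=0.05))
```

With a real agent, the same helper could be called with the agent object in place of StubAgent, assuming the method signature shown above.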

Return Value

Returns a GetSimplifiedHtmlResponse object with the following structure:
html (str, required)
The simplified HTML content of the current page as a string, processed and optimized by Narada’s backend for content analysis.
# Example response
response.html = "<html><body><main><h1>Title</h1><p>Content...</p></main></body></html>"
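Because the html field is a plain string, it can be handed to any HTML parser. The following is a minimal sketch that pulls heading text out of a string shaped like the example response, using only the standard library. HeadingCollector and extract_headings are names invented for this illustration, and the actual structure of Narada's simplified output may differ from the sample.

```python
from __future__ import annotations

from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3")


class HeadingCollector(HTMLParser):
    """Collect the text of h1-h3 headings from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.headings: list[str] = []
        self._in_heading = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._in_heading:
            self.headings.append("".join(self._buffer).strip())
            self._in_heading = False


def extract_headings(html: str) -> list[str]:
    """Return heading texts in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Sample string copied from the example response above.
sample = "<html><body><main><h1>Title</h1><p>Content...</p></main></body></html>"
headings = extract_headings(sample)
```

In real use, response.html would replace the sample string.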

Example

import asyncio

from bs4 import BeautifulSoup  # third-party package: beautifulsoup4
from narada import Agent, BrowserEnvironment


async def main():
    env = BrowserEnvironment()
    agent = Agent(environment=env)

    try:
        # Navigate to a page
        await agent.go_to_url(url="https://example.com/article")

        # Get simplified HTML for content analysis
        response = await agent.get_simplified_html()

        # Extract text content from simplified HTML
        soup = BeautifulSoup(response.html, "html.parser")
        text_content = soup.get_text(separator=" ", strip=True)

        print(f"Extracted text: {text_content[:200]}...")
    finally:
        await env.close()

if __name__ == "__main__":
    asyncio.run(main())