The get_simplified_html method returns a simplified version of the current page’s HTML, processed by Narada for content extraction and analysis.
Unlike get_full_html, the result has unnecessary elements removed and a structure optimized for content analysis.

Method Signature

async def get_simplified_html(
    self,
    *,
    timeout: int | None = None,
) -> GetSimplifiedHtmlResponse

Parameters

timeout (int | None, default: None)
Maximum time in seconds to wait for the simplified HTML retrieval to complete. If None, the default system timeout is used.
timeout=30     # Wait up to 30 seconds
timeout=None   # Use default timeout
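
The timeout parameter bounds the retrieval itself. The page does not say what happens when that limit is exceeded, so a caller that needs a hard deadline may want a client-side guard as well. The sketch below shows one way to do that with asyncio.wait_for. FakeWindow, fetch_with_deadline, and the grace parameter are hypothetical names introduced here for illustration. FakeWindow only imitates the call shape of get_simplified_html so that the example runs without a browser.

```python
from __future__ import annotations

import asyncio
from types import SimpleNamespace


class FakeWindow:
    """Hypothetical stand-in for a Narada window (illustration only)."""

    def __init__(self, delay: float) -> None:
        self._delay = delay

    async def get_simplified_html(self, *, timeout: int | None = None):
        # Simulate a retrieval that takes `delay` seconds.
        await asyncio.sleep(self._delay)
        return SimpleNamespace(html="<html><body><p>ok</p></body></html>")


async def fetch_with_deadline(window, *, timeout: int, grace: float = 1.0):
    """Pass `timeout` to the call and enforce a client-side deadline too.

    Returns the response, or None if the client-side deadline expires.
    """
    try:
        return await asyncio.wait_for(
            window.get_simplified_html(timeout=timeout),
            timeout=timeout + grace,  # small margin beyond the method's own limit
        )
    except asyncio.TimeoutError:
        return None


if __name__ == "__main__":
    response = asyncio.run(fetch_with_deadline(FakeWindow(0.0), timeout=30))
    print(response.html if response else "timed out")
```

With a real window object, the same helper would be awaited inside the async with Narada() block. Returning None on expiry is a design choice of this sketch, not documented Narada behavior.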

Return Value

Returns a GetSimplifiedHtmlResponse object with the following structure:
html (str, required)
The simplified HTML content of the current page as a string, processed and optimized by Narada’s backend for content analysis.
# Example response
response.html = "<html><body><main><h1>Title</h1><p>Content...</p></main></body></html>"

Example

import asyncio

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
from narada import Narada

async def main():
    async with Narada() as narada:
        window = await narada.open_and_initialize_browser_window()

        # Navigate to a page
        await window.go_to_url(url="https://example.com/article")

        # Get simplified HTML for content analysis
        response = await window.get_simplified_html()

        # Extract text content from the simplified HTML
        soup = BeautifulSoup(response.html, "html.parser")
        text_content = soup.get_text(separator=" ", strip=True)

        # Print only the first 200 characters as a preview
        print(f"Extracted text: {text_content[:200]}...")

if __name__ == "__main__":
    asyncio.run(main())
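
The text-extraction step in the example relies on the third-party BeautifulSoup package. Where adding that dependency is not desirable, a rough equivalent can be sketched with the standard library's html.parser module. TextExtractor and html_to_text are hypothetical helper names, not part of Narada, and the sample string is the example response value from the Return Value section.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the non-empty text nodes of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        # Called for each run of text between tags.
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_text(html: str) -> str:
    """Return the text nodes of `html`, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


if __name__ == "__main__":
    sample = "<html><body><main><h1>Title</h1><p>Content...</p></main></body></html>"
    print(html_to_text(sample))  # prints: Title Content...
```

In the example above, html_to_text(response.html) could stand in for the two BeautifulSoup lines. This sketch does not skip script or style contents, so it assumes the simplified HTML no longer contains such elements.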