What can I do when I get an OperationTimedOut error? #679

@karimWorldSpace

Hi community,

I'm working on a personal project where I need to retrieve the title and HTML content of a webpage (a simple task).

Sometimes the page I visit shows a cookie consent prompt, but the HTML content is already fully loaded, so I don't actually care about the prompt. All the information I need is in the HTML.

Here’s my problem:

  1. When I try to evaluate the title of the page, I often get a timeout error.
  2. To handle this, I retry once after the first failure, but the retry fails as well and I don't know what else to try.
  3. What’s confusing is that if I manually check the title in the browser’s console, it works perfectly. However, when I try to retrieve the title programmatically in PHP, it doesn’t work.
  4. Does anyone know why this might happen or how I can fix it?

Thanks for your help!

Here is my code:

` $urls = $urlScrapedByKeyWordRepository->findBy(['isUsedForGeneration' => false]);
shuffle($urls);
$urls = array_slice($urls, 0, 2);

    if ($urls) {
        /** @var UrlScrapedByKeyword[] $urls */
        foreach ($urls as $key => $url) {
            $urlScrapped = ltrim($url->getUrl(), './');
           // $urlScrapped =  $urlScrapped;

            $browser = $this->createBrowser();
            $page = $browser->createPage();
            $html = false;

            try {
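                // 'strict' has to be passed as a key; ['strict'] only sets index 0 and leaves the option off
                $page->navigate($urlScrapped, ['strict' => true])->waitForNavigation(Page::INTERACTIVE_TIME, 6000);
                // Log the actual title, not the literal string 'document.title'
                $page->evaluate("console.log(document.title)");

                // This is where my code crashes, so I catch the error below
                $pageTitle = $page->evaluate('document.title')->waitForResponse()->getReturnValue();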

                if ($pageTitle == 'Before you continue')
                {
                    $this->AcceptGoogleCookies($page);
                    $pageTitle = $page->evaluate('document.title')->waitForResponse()->getReturnValue();
                } 

                echo($pageTitle.' from normal way');
                $pageContent = $page->getHtml(2500);
                sleep(1);
                if ($pageContent) echo('content OK');

                if ($pageTitle == 'Before you continue') $this->AcceptGoogleCookies($page);
            } catch (OperationTimedOut $e) {
                // In the browser console I can see that this call works correctly
                $page->evaluate("console.log(document.title)");

                // Catch the error and retry evaluating the title, but it crashes again here
                $pageTitle = $page->evaluate('document.title')->getReturnValue();

                if ($pageTitle == 'Before you continue') $this->AcceptGoogleCookies($page);

                echo $pageTitle.' from error';
                $pageContent = $page->getHtml(2500);
                sleep(1);
                if ($pageContent) echo('content OK from error');
            } catch (NavigationExpired $e) {
                // $pageTitle may not be set yet at this point, so report the URL instead
                echo "NavigationExpired error while evaluating the title of $urlScrapped<br>";
            }
        }
    }`
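
The single retry described in point 2 can also be written as a bounded retry loop, so the number of attempts and the pause between them are explicit. This is only a minimal sketch: `retryOnException` is a hypothetical helper, not part of the library, and it does not explain why the evaluation times out in the first place.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not a library function): run $operation up to
 * $maxAttempts times, retrying only when it throws an instance of $retryOn.
 *
 * @param callable $operation   receives the 1-based attempt number
 * @param string   $retryOn     class name of the exception that triggers a retry
 * @param int      $maxAttempts total number of attempts, at least 1
 * @param int      $delayMs     pause between attempts, in milliseconds
 *
 * @return mixed whatever $operation returns on its first successful attempt
 */
function retryOnException(callable $operation, string $retryOn, int $maxAttempts = 3, int $delayMs = 0)
{
    if ($maxAttempts < 1) {
        throw new \InvalidArgumentException('maxAttempts must be at least 1');
    }

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation($attempt);
        } catch (\Throwable $e) {
            // Rethrow exceptions we were not asked to retry, and the final failure.
            if (!($e instanceof $retryOn) || $attempt >= $maxAttempts) {
                throw $e;
            }
            if ($delayMs > 0) {
                usleep($delayMs * 1000);
            }
        }
    }
}

// Assumed usage with the code above (not executed here; $page and
// OperationTimedOut come from the original snippet):
//
// $pageTitle = retryOnException(
//     fn () => $page->evaluate('document.title')->getReturnValue(),
//     OperationTimedOut::class,
//     3,
//     500
// );
```

Any exception other than the one named in `$retryOn` is rethrown immediately, so a `NavigationExpired` would still reach its own `catch` block.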
